In the TensorFlow tutorial on neural machine translation, the
loss_function() function masks the loss on padded tokens.
My question is: won't the cross-entropy function itself cancel out
the loss term for padded tokens? If so, why is the masking needed?
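
To make the question concrete, here is a minimal sketch in plain Python (not the tutorial's code) of what I understand the masking to do. It assumes the padding token has id 0 and that the loss is averaged over all positions; the names PAD_ID, token_cross_entropy and sequence_loss are hypothetical, made up for illustration.

```python
import math

PAD_ID = 0  # assumption: padding uses token id 0


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def token_cross_entropy(target_id, logits):
    """Sparse cross-entropy for one position: -log p(target_id)."""
    return -math.log(softmax(logits)[target_id])


def sequence_loss(targets, logits_per_step, mask_padding):
    """Mean per-token loss; optionally zero out padded positions first."""
    losses = [token_cross_entropy(t, l)
              for t, l in zip(targets, logits_per_step)]
    if mask_padding:
        losses = [0.0 if t == PAD_ID else loss
                  for t, loss in zip(targets, losses)]
    return sum(losses) / len(losses)


# Toy sequence: two real tokens followed by two padding tokens.
targets = [2, 1, PAD_ID, PAD_ID]
logits = [
    [0.1, 0.2, 3.0],
    [0.3, 2.5, 0.1],
    [0.5, 0.4, 0.6],
    [0.2, 0.9, 0.1],
]

unmasked = sequence_loss(targets, logits, mask_padding=False)
masked = sequence_loss(targets, logits, mask_padding=True)
print(unmasked, masked)
```

In this sketch a padded position still contributes -log p(0), which is only zero if the model puts all probability on the padding id, so the unmasked loss comes out larger than the masked one. If the tutorial's cross-entropy behaves the same way, that would seem to be the reason for the mask, but I have not confirmed it against the actual code.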
submitted by /u/AI_Astronaut9852