I’m building a TensorFlow model that will have variable-sized inputs, with zero (or some other value) padding used to bring smaller inputs up to the standard input size. I also intend to use some form of weight decay (L1/L2 regularization).
My concern is that, during training, whenever a padded input comes in, the weights leading out of the zero-inputs will still be decayed by whatever regularization I use, even though those weights receive no gradient from the data. Ideally, I would like to disable the L1/L2 regularization on exactly those weights. Is there a way to get TF to do this? Disabling all weight updates on those weights would also work.
If it helps, I can certainly pad with a value that doesn’t appear anywhere in the natural data, so that any occurrence of that value would indicate which weight updates should be masked. The layers will be convolutional.
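To make the intent concrete, here is roughly the kind of masked update I have in mind (a TF2/Keras sketch; the `update_masks` list and the freezing scheme are hypothetical, and here I just freeze everything to illustrate the mechanism). Zeroing a weight’s gradient blocks both the data term and the regularization term, since the L2 penalty enters the loss via `model.losses`:

```python
import numpy as np
import tensorflow as tf

tf.random.set_seed(0)

# A conv layer with L2 regularization on its kernel.
model = tf.keras.Sequential([
    tf.keras.layers.Conv1D(
        filters=4, kernel_size=3,
        kernel_regularizer=tf.keras.regularizers.l2(1e-3),
        input_shape=(16, 1)),
])
optimizer = tf.keras.optimizers.SGD(learning_rate=0.1)

def train_step(x, y, update_masks):
    """One step; update_masks[i] matches trainable_variables[i], 0 = freeze."""
    with tf.GradientTape() as tape:
        pred = model(x, training=True)
        # model.losses carries the L2 penalty, so its gradient is the decay.
        loss = tf.reduce_mean(tf.square(pred - y)) + tf.add_n(model.losses)
    grads = tape.gradient(loss, model.trainable_variables)
    # Zero the gradient for masked weights: neither the data gradient
    # nor the decay gradient moves them.
    masked = [g * m for g, m in zip(grads, update_masks)]
    optimizer.apply_gradients(zip(masked, model.trainable_variables))

x = np.random.randn(2, 16, 1).astype("float32")
y = np.random.randn(2, 14, 4).astype("float32")  # 16 - 3 + 1 = 14 (valid conv)

# Freeze everything for demonstration: weights (and their decay) should not move.
zero_masks = [tf.zeros_like(v) for v in model.trainable_variables]
before = [v.numpy().copy() for v in model.trainable_variables]
train_step(x, y, zero_masks)
after = [v.numpy() for v in model.trainable_variables]
```

In practice I would want `update_masks` derived per-batch from where the padding value occurs in the input, rather than a fixed all-zeros mask, but I’m not sure how to map padded input positions to the affected conv weights.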