How does TensorFlow calculate mean squared error under the hood? (cannot reproduce with custom loop)

Hi all,

My question is related to one I asked recently: post

I need to loop over individual samples during training because the batch is too large to hold in memory. I have had good success generating reproducible losses and accumulated gradients with one of my training loops, and the gradients applied to the weights are accurate (up to floating-point error).

Another custom loop I run on a batch computes the mean squared error between the predicted label and the real label. Again, I need to iterate over the batch of samples manually because of the large batch size. To confirm it works and that I get the same losses and gradients, I am running my custom loop on a batch of 100 samples so I can compare both methods using 'GradientTape()'.

My code snippets are as follows. For batch training:

with tf.GradientTape() as tape:
    value_loss = tf.reduce_mean((return_buffer - critic_model([degree_buffer, graph_adj_buffer, action_vect_buffer])) ** 2)

value_grads = tape.gradient(value_loss, critic_model.trainable_variables)
value_optimizer.apply_gradients(zip(value_grads, critic_model.trainable_variables))

For individual samples:

value_loss_tracking = []
total_loss = 0
train_vars_val = critic_model_individual.trainable_variables
accum_gradient_val = [tf.zeros_like(this_var) for this_var in train_vars_val]

for adj_ind, degree_ind, action_vect_ind, return_ind in zip(graph_adj_buffer, degree_buffer, action_vect_buffer, return_buffer_):
    adj_ind = adjacency_normed_tensor(adj_ind)
    degree_ind = tf.expand_dims(degree_ind, 0)
    action_vect_ind = tf.expand_dims(action_vect_ind, 0)

    with tf.GradientTape() as tape:
        ind_value_loss = tf.square(return_ind - critic_model_individual([degree_ind, adj_ind, action_vect_ind]))

    value_loss_tracking.append(ind_value_loss)
    total_loss += ind_value_loss
    gradients = tape.gradient(ind_value_loss, train_vars_val)
    accum_gradient_val = [(accum_grad + grad) for accum_grad, grad in zip(accum_gradient_val, gradients)]

accum_gradient_vals_final = [this_grad / steps_per_epoch for this_grad in accum_gradient_val]
policy_optimizer_ind.apply_gradients(zip(accum_gradient_vals_final, train_vars_val))

mean_loss = tf.reduce_mean(value_loss_tracking)

Both loops run fine. However, when I compare the loss from my custom loop with the mean squared error from the batch loop, the values sometimes differ starting from the first decimal place, and they do not look like floating-point errors to me: e.g. 0.43429542 vs 0.4318762 seem too different for that. In the other custom loop, I see the values diverge only after about five decimal places, which is not the case here. Sometimes I even see losses like 0.39 compared to 0.40, which does not seem right. Does anybody know if this makes sense, or agree that this does not look right? I have also tried np.mean and np.square. I have looked at the source code and cannot see exactly how TensorFlow does this under the hood!
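As a sanity check on how large a discrepancy float32 accumulation order alone can produce, here is a minimal sketch (not the author's TensorFlow code; the arrays are synthetic and the names illustrative). tf.reduce_mean of a squared difference is mathematically just sum(err**2) / n, so a vectorised batch MSE and a per-sample accumulation of tf.square should agree to roughly six or seven significant digits in float32. A mismatch in the first or second decimal place therefore usually points elsewhere, e.g. a shape mismatch that silently broadcasts (a (100,) tensor against a (100, 1) prediction yields a (100, 100) matrix of differences), or the two models not sharing identical weights.

```python
import numpy as np

rng = np.random.default_rng(0)
y_true = rng.normal(size=100).astype(np.float32)
y_pred = rng.normal(size=100).astype(np.float32)

# Batched: one vectorised mean over the whole batch, as in the batch loop.
batched = np.mean((y_true - y_pred) ** 2, dtype=np.float32)

# Per-sample: accumulate squared errors one at a time, then divide,
# mimicking the individual-sample loop.
total = np.float32(0.0)
for t, p in zip(y_true, y_pred):
    total += np.float32((t - p) ** 2)
per_sample = total / np.float32(len(y_true))

print(batched, per_sample)
# The two orderings agree far beyond two decimal places in float32.
assert abs(float(batched) - float(per_sample)) < 1e-5
```

If the two code paths in the question diverge by more than this sketch suggests, the difference is coming from the computation itself, not from how TensorFlow reduces the mean.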

Any help is appreciated!

submitted by /u/amjass12
