Categories
Misc

Confused between the gradient of a vector and the gradient of a scalar.

Let's assume f is our NN. Individual data points are (x, y) and the batched data is (X, Y).

  • y = f(x): I take the gradient of y with respect to every parameter of the NN; I'll call this g.
  • Y = f(X): this is the vectorized version. If I now take the gradient of Y with respect to the params of the NN, I get G.

What is the relation between G and g? Is G the average of all the g in that batch, or something else?

For context, I am facing this difficulty while implementing the policy gradient (reinforcement learning) algorithm. In policy gradient we have to average over some of the gradients of the policy function. The confusion is whether I should do that for individual states or use a batch of states, because in both cases the gradients have the same dimensions.
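For what it's worth, with the usual mean-reduced loss the two views coincide: the gradient of the batch-averaged objective is exactly the average of the per-sample gradients (autodiff only differentiates scalars, so a non-scalar Y is implicitly reduced first). A minimal sketch with a toy model, not the actual policy network:

    import tensorflow as tf

    # Toy stand-in for the NN f (hypothetical; not the poster's model).
    model = tf.keras.Sequential([tf.keras.layers.Dense(1, input_shape=(3,))])
    loss_fn = tf.keras.losses.MeanSquaredError()  # default reduction averages over the batch

    X = tf.random.normal((4, 3))  # batch of 4 inputs
    Y = tf.random.normal((4, 1))

    # G: gradient of the mean loss over the whole batch.
    with tf.GradientTape() as tape:
        batch_loss = loss_fn(Y, model(X))
    G = tape.gradient(batch_loss, model.trainable_variables)

    # g_i: per-sample gradients, averaged by hand.
    per_sample = []
    for i in range(4):
        with tf.GradientTape() as tape:
            sample_loss = loss_fn(Y[i:i + 1], model(X[i:i + 1]))
        per_sample.append(tape.gradient(sample_loss, model.trainable_variables))
    g_avg = [tf.reduce_mean(tf.stack(g), axis=0) for g in zip(*per_sample)]

    # With a mean-reduced loss, G matches the average of the per-sample g.
    for a, b in zip(G, g_avg):
        tf.debugging.assert_near(a, b)

So averaging the per-state gradients g and differentiating the batch-averaged objective give the same update, which is why the gradients have the same dimensions in both cases.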

submitted by /u/Better-Ad8608
[visit reddit] [comments]

Categories
Misc

Are the gradients averaged out during batch learning process?

Suppose I have a simple 4-layer NN. I want to train it to recognize the pattern in some data. I have two ways to train it:

  1. I pass one data sample, calculate the loss, calculate the gradients of that loss with respect to the parameters (weights and biases) of the neural network, adjust the parameters of the NN, and repeat until the loss is minimized.
  2. I pass batches of data and do the rest as mentioned above. This is called vectorization.

So my question is: are the gradients averaged over all the samples in a batch?
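With the standard Keras losses this is effectively what happens: the loss is averaged over the batch, so a mini-batch step applies the mean of the per-sample gradients as a single update. A minimal sketch of one such step (toy model and random data, purely illustrative):

    import tensorflow as tf

    # Toy 4-layer network (hypothetical shapes, just for illustration).
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(32, activation="relu", input_shape=(8,)),
        tf.keras.layers.Dense(32, activation="relu"),
        tf.keras.layers.Dense(16, activation="relu"),
        tf.keras.layers.Dense(1),
    ])
    loss_fn = tf.keras.losses.MeanSquaredError()  # reduces to the batch mean by default
    opt = tf.keras.optimizers.SGD(learning_rate=0.01)

    X = tf.random.normal((32, 8))  # one mini-batch of 32 samples
    Y = tf.random.normal((32, 1))

    # One update: the gradient of the batch-averaged loss,
    # i.e. the average of the 32 per-sample gradients.
    with tf.GradientTape() as tape:
        loss = loss_fn(Y, model(X))
    grads = tape.gradient(loss, model.trainable_variables)
    opt.apply_gradients(zip(grads, model.trainable_variables))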

submitted by /u/Better-Ad8608
[visit reddit] [comments]

Categories
Misc

Using TensorFlow for time-series forecasting on datasets whose values are both positive and negative at different points in time. What are best practices for preprocessing this type of data, and what layer structure would you recommend?
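Not an authoritative answer, but a common baseline for series that swing between positive and negative values is plain standardization (subtract the training-set mean, divide by its standard deviation) plus a linear output layer, so the network can predict negative values directly. A rough sketch under those assumptions, using synthetic data:

    import numpy as np
    import tensorflow as tf

    # A 1-D series with both positive and negative values (synthetic placeholder data).
    series = np.random.randn(1000).astype("float32")

    # Standardize using statistics from the training portion only.
    split = int(0.8 * len(series))
    mean, std = series[:split].mean(), series[:split].std()
    series = ((series - mean) / std).reshape(-1, 1)  # add a feature axis

    # Windowed (input, next-value) pairs via the Keras utility.
    window = 24
    ds = tf.keras.utils.timeseries_dataset_from_array(
        data=series[:-window], targets=series[window:],
        sequence_length=window, batch_size=32)

    # A small LSTM forecaster; predictions are un-scaled later with the saved mean/std.
    model = tf.keras.Sequential([
        tf.keras.layers.LSTM(32, input_shape=(window, 1)),
        tf.keras.layers.Dense(1),  # linear output, so negative values are fine
    ])
    model.compile(optimizer="adam", loss="mse")
    model.fit(ds, epochs=2)

Min-max scaling to [-1, 1] is another frequently used option; whichever you pick, fit the scaler on the training split only and invert it on the predictions.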

submitted by /u/bitbyt3bit
[visit reddit] [comments]

Categories
Misc

TensorFlow Lite Model Maker

I was wondering, when creating an image classifier with tflite-model-maker, what is the optimal size for the images in the dataset? Thanks 🙂

submitted by /u/DrakenZA
[visit reddit] [comments]

Categories
Misc

Incompatible Shapes due to Batch Size

Hi all. I am trying to train a neural net to perform handwritten character recognition, and have attached the relevant code below. I'm training on 28×28 characters from the EMNIST dataset. No matter what I change the batch size to, when I try to train the model I always get an error like the one at the bottom. Does anyone know how to fix this? Thank you for the help!

    from tensorflow.keras.models import Sequential
    from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense
    from tensorflow.keras.optimizers import SGD
    from tensorflow.keras.utils import to_categorical

    def load_dataset(X_train, y_train, X_test, y_test):
        # Arrays are assumed to be loaded from EMNIST beforehand.
        X_train = X_train.reshape((X_train.shape[0], 28, 28, 1))
        X_test = X_test.reshape((X_test.shape[0], 28, 28, 1))
        y_train = to_categorical(y_train)
        y_test = to_categorical(y_test)
        return X_train, y_train, X_test, y_test

    def prep_pixels(train, test):
        train_norm = train.astype('float32') / 255.0
        test_norm = test.astype('float32') / 255.0
        return train_norm, test_norm

    def define_model():
        model = Sequential()
        model.add(Conv2D(32, (3, 3), activation='relu', kernel_initializer='he_uniform',
                         input_shape=(28, 28, 1)))
        model.add(MaxPooling2D((2, 2)))
        model.add(Flatten())
        model.add(Dense(100, activation='relu', kernel_initializer='he_uniform'))
        model.add(Dense(46, activation='softmax'))
        opt = SGD(learning_rate=0.01, momentum=0.9)
        model.compile(optimizer=opt, loss='categorical_crossentropy', metrics=['accuracy'])
        return model

    def evaluate_model(dataX, dataY, xtest, ytest, n_folds=5):
        scores, histories = [], []
        model = define_model()
        history = model.fit(dataX, dataY, epochs=4, batch_size=64,
                            validation_data=(xtest, ytest))
        _, acc = model.evaluate(dataX, dataY, verbose=0)
        scores.append(acc)
        histories.append(history)
        return scores, histories


 ValueError: Shapes (64, 1) and (64, 46) are incompatible 

EDIT: Fixed some bugs, but still have the same error
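Judging only from the error, the labels that actually reach model.fit are still integer class indices of shape (64, 1), while Dense(46) with categorical_crossentropy expects one-hot targets of shape (64, 46). Two equivalent fixes, sketched with the variable names above (adjust to the real ones, and to the actual number of classes):

    from tensorflow.keras.optimizers import SGD
    from tensorflow.keras.utils import to_categorical

    # Option 1: one-hot encode the labels that are actually passed to fit().
    y_train_oh = to_categorical(y_train, num_classes=46)
    y_test_oh = to_categorical(y_test, num_classes=46)
    model.fit(X_train, y_train_oh, epochs=4, batch_size=64,
              validation_data=(X_test, y_test_oh))

    # Option 2: keep integer labels and use the sparse loss instead.
    model.compile(optimizer=SGD(learning_rate=0.01, momentum=0.9),
                  loss='sparse_categorical_crossentropy', metrics=['accuracy'])
    model.fit(X_train, y_train, epochs=4, batch_size=64,
              validation_data=(X_test, y_test))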

submitted by /u/landshark223344
[visit reddit] [comments]

Categories
Misc

Correct way to prune a model for faster CPU inference?

I have been following the TensorFlow examples on how to set up a model for pruning and quantise it in order to improve inference speed. What I noticed, however, was:

1) the sparse model resulting from pruning has no inference-speed benefit;

2) quantisation makes the model even slower (I know this is probably due to TFLite not being optimised for x86).

What method do you use to prune your models?
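Not a definitive recipe, but as far as I understand the TF Model Optimization workflow, magnitude pruning on its own mainly shrinks the compressed model: the kernels stay dense (just with zeros), so a standard runtime won't run them faster unless it has sparse kernels (e.g. sparse inference support in TFLite/XNNPACK). The usual export step looks roughly like the sketch below, using a toy model in place of the real one:

    import tensorflow as tf
    import tensorflow_model_optimization as tfmot

    # Toy dense model standing in for the real one (hypothetical).
    base = tf.keras.Sequential([
        tf.keras.layers.Dense(64, activation="relu", input_shape=(10,)),
        tf.keras.layers.Dense(1),
    ])

    # Wrap for magnitude pruning; in a real run you would train this with the
    # tfmot.sparsity.keras.UpdatePruningStep() callback so sparsity is applied.
    pruned = tfmot.sparsity.keras.prune_low_magnitude(base)

    # Strip the pruning wrappers before export: the result is a plain Keras
    # model whose kernels simply contain zeros.
    stripped = tfmot.sparsity.keras.strip_pruning(pruned)

    # Convert to TFLite. The zeros compress well (e.g. after gzip), but a dense
    # runtime will not run the model any faster.
    converter = tf.lite.TFLiteConverter.from_keras_model(stripped)
    with open("pruned_model.tflite", "wb") as f:
        f.write(converter.convert())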

submitted by /u/ats678
[visit reddit] [comments]

Categories
Misc

ValueError: The first argument to `Layer.call` must always be passed

I have two classes and I'm trying to build a TensorFlow ranking model (using TFRS).

When I run model.fit(cached_train, epochs=3), I get this error:

ValueError: The first argument to `Layer.call` must always be passed


    from typing import Dict, Text

    import tensorflow as tf
    import tensorflow_recommenders as tfrs

    class ProdRankingModel(tf.keras.Model):

        def __init__(self):
            super().__init__()
            embedding_dimension = 32

            self.user_embeddings = tf.keras.Sequential([
                tf.keras.layers.StringLookup(
                    vocabulary=unique_user_ids, mask_token=None),
                tf.keras.layers.Embedding(len(unique_user_ids) + 1, embedding_dimension)
            ])

            self.prod_embeddings = tf.keras.Sequential([
                tf.keras.layers.StringLookup(
                    vocabulary=unique_items, mask_token=None),
                tf.keras.layers.Embedding(len(unique_items) + 1, embedding_dimension)
            ])

            # Compute predictions.
            self.ratings = tf.keras.Sequential([
                tf.keras.layers.Dense(256, activation="relu"),
                tf.keras.layers.Dense(64, activation="relu"),
                tf.keras.layers.Dense(1)
            ])

        def call(self, inputs):
            user_id, products = inputs
            user_embedding = self.user_embeddings(user_id)
            product_embedding = self.prod_embeddings(products)
            return self.ratings(tf.concat([user_embedding, product_embedding], axis=1))


    class ProductModel(tfrs.models.Model):

        def __init__(self):
            super().__init__()
            self.prodranking_model: tf.keras.Model = ProdRankingModel()
            self.task: tf.keras.layers.Layer = tfrs.tasks.Ranking(
                loss=tf.keras.losses.MeanSquaredError(),
                metrics=[tf.keras.metrics.RootMeanSquaredError()]
            )

        def call(self, features: Dict[str, tf.Tensor]) -> tf.Tensor:
            return self.prodranking_model(
                (features["user_id"], features["prod_name"]))

        def compute_loss(self, features: Dict[Text, tf.Tensor], training=False) -> tf.Tensor:
            labels = features.pop("prod_count")
            rating_predictions = self(features)
            # The task computes the loss and the metrics.
            return self.task(labels=labels, predictions=rating_predictions)

submitted by /u/TimelyAbbreviations1
[visit reddit] [comments]

Categories
Misc

Need some help with my first CNN for a multi-classification task

Hi! I’m building a CNN for the first time for a university project: the idea is to classify images coming from 10 animal classes (taken from ImageNet).

Anybody willing to give me some advice about my model? Here’s my model:

    from tensorflow.keras import Sequential, layers

    model = Sequential()
    model.add(layers.InputLayer(input_shape=(224, 224, 3)))
    model.add(layers.BatchNormalization())
    model.add(layers.Conv2D(64, (3, 3), activation='relu'))
    model.add(layers.BatchNormalization())
    model.add(layers.Conv2D(64, (3, 3), activation='relu'))
    model.add(layers.BatchNormalization())
    model.add(layers.Conv2D(128, (3, 3), activation='relu'))
    model.add(layers.BatchNormalization())
    model.add(layers.Conv2D(256, (3, 3), activation='relu'))
    model.add(layers.AvgPool2D(2, 2))
    model.add(layers.Flatten(name='features_layer'))
    model.add(layers.Dense(10, activation='softmax'))

Training loss keeps decreasing and training accuracy keeps improving (it even converges to 1), but after ~10 epochs I'm stuck around 0.5 validation accuracy. The dataset contains 2500 images, and I tried both a 0.2 and a 0.3 validation split. Using 0.2 leads to more stable results, but reaches 0.5 val_accuracy more slowly. I could try augmenting my dataset (creating slightly modified copies of the images I have), but the training time increases quite a lot, and I would like to be reasonably sure that my model is good (at least in theory) before training for a high number of epochs on the expanded dataset.

Is the overall structure of my net correct? Some questions:

  • Should the number of convolutional layers and the number of kernels used in each be set 'by heart'? I've seen CNNs with very different architectures when it comes to the number of Conv2D layers included, and I'm not sure whether there's some heuristic I should follow.
  • Should I use one batch normalization layer at the end of the convolutions, or should I go with a batch normalization after each Conv2D?

Sorry for the bunch of 'noob' questions, I hope you'll understand my perplexity. As I said, it's my first time building a CNN myself, and I feel that even really general suggestions might help. I would really appreciate any advice c:
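Training accuracy converging to 1 while validation stalls around 0.5 on 2500 images is classic overfitting, so augmentation is probably the cheapest experiment; rather than materializing modified copies on disk, you can put augmentation layers inside the model so they run on the fly during training only. A sketch assuming a recent TF (2.6+) where these layers live under tf.keras.layers; the conv stack is also rearranged a bit (pooling between blocks, global average pooling, dropout) purely as an illustration, not as the 'correct' architecture:

    from tensorflow.keras import Sequential, layers

    augment = Sequential([
        layers.RandomFlip("horizontal"),
        layers.RandomRotation(0.1),
        layers.RandomZoom(0.1),
    ])

    model = Sequential()
    model.add(layers.InputLayer(input_shape=(224, 224, 3)))
    model.add(augment)                       # active only during training
    model.add(layers.Rescaling(1.0 / 255))   # scale pixels instead of a BatchNorm at the input
    model.add(layers.Conv2D(64, (3, 3), activation='relu'))
    model.add(layers.MaxPooling2D())
    model.add(layers.Conv2D(128, (3, 3), activation='relu'))
    model.add(layers.MaxPooling2D())
    model.add(layers.Conv2D(256, (3, 3), activation='relu'))
    model.add(layers.GlobalAveragePooling2D())
    model.add(layers.Dropout(0.3))           # extra regularization for a small dataset
    model.add(layers.Dense(10, activation='softmax'))

On the BatchNorm question: placing it after each Conv2D (rather than once at the end) is the more common pattern, but with only 2500 images the augmentation and dropout above are likely to matter more than the exact normalization placement.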

submitted by /u/synchro-azel
[visit reddit] [comments]

Categories
Misc

Looking for a proper model to train (with transfer learning on TensorFlow) on RGB images and 3D point data for hand pose estimation.

submitted by /u/blevlabs
[visit reddit] [comments]

Categories
Misc

Does a MacBook Air with 16GB have enough power to build TensorFlow from source?

I faced an error when I was building TensorFlow from source on my laptop. It has only 8GB of RAM, and running out of memory is the reason for the error. I'm going to buy a new laptop, but I'm not sure how much memory building TensorFlow requires. A MacBook Air is one of the candidates. I would be happy to hear anything about this. Thanks.

submitted by /u/SubstantialSwimmer4
[visit reddit] [comments]