Categories
Misc

How to encrypt a tflite model

Hi, I am trying to run a tflite model in the browser. It would run client-side; I have converted the model to wasm format and am able to run it successfully in the browser.

Since it would be client-side, the tflite model would be accessible to everyone. Is it possible to encrypt the model in any way, so that not everyone has access to it?

The application is built using the MediaPipe framework; I'm not sure if that changes the solution.

Thanks!

submitted by /u/cvmldlengineer

Categories
Misc

M40 vs 2080S: which is better?

So I found out that a Tesla M40 has the same number of CUDA cores as a 2080S, but the 2080S has tensor cores and is a lot more expensive, while the M40 is a lot cheaper. So which would be the best bang for the buck?

Price 2080S: $700. Specs:

  • CUDA cores: 3072
  • Core clock: 1815 MHz
  • RAM: 8 GB
  • Memory clock: 2000 MHz (15.5 Gbps effective)
  • Tensor cores: 384

Price M40: $140 (used). Specs:

  • CUDA cores: 3072
  • Core clock: 1110 MHz
  • RAM: 12 GB
  • Memory clock: 1502 MHz (6 Gbps effective)

Comparing prices, I would think that for coding, rendering, and AI, two M40s would be the better deal, but tell me what you guys think.

submitted by /u/isaiahii10

Categories
Misc

New Public Workshops Now Available from the NVIDIA Deep Learning Institute

For the first time ever, the NVIDIA Deep Learning Institute (DLI) is making its popular instructor-led workshops available to the general public.

With the launch of public workshops this week, enrollment will be open to individual developers, data scientists, researchers, and students. NVIDIA is increasing accessibility and the number of courses available to participants around the world. Now anyone can learn from world-class NVIDIA instructors in courses on AI, accelerated computing, and data science. 

Previously, DLI workshops were only available to large organizations that wanted dedicated and specialized training for their in-house developers, or to individuals attending NVIDIA GTC.

Boost Your Skills with Industry-Leading Training

Job growth in the tech industry continues, and advanced software development skills in deep learning, data science, and accelerated computing are highly sought after. DLI workshops offer a comprehensive learning experience that includes hands-on exercises and guidance from expert instructors certified by DLI. Courses are delivered virtually across many time zones to reach developers worldwide. In addition to English, many courses are offered in other languages, including Chinese and Japanese.

With the introduction of DLI workshops for individuals, NVIDIA is making it easier for anyone to access world-class training. Registration fees cover learning materials, instructors, and access to fully configured, GPU-accelerated development servers for hands-on exercises.

The current lineup of DLI workshops for individuals includes:

March 2021

  • Fundamentals of Accelerated Computing with CUDA Python
  • Applications of AI for Predictive Maintenance

April 2021

  • Fundamentals of Deep Learning
  • Applications of AI for Anomaly Detection
  • Fundamentals of Accelerated Computing with CUDA C/C++
  • Building Transformer-Based Natural Language Processing Applications
  • Deep Learning for Autonomous Vehicles – Perception
  • Fundamentals of Accelerated Data Science with RAPIDS
  • Accelerating CUDA C++ Applications with Multiple GPUs
  • Fundamentals of Deep Learning for Multi-GPUs

May 2021

  • Building Intelligent Recommender Systems
  • Fundamentals of Accelerated Data Science with RAPIDS
  • Deep Learning for Industrial Inspection
  • Building Transformer-Based Natural Language Processing Applications
  • Applications of AI for Anomaly Detection

Visit the DLI website for details on each course and the full schedule of upcoming workshops, which is regularly updated with new training opportunities.

A complete list of DLI courses is available in the DLI course catalog.

Register today for a DLI instructor-led workshop for individuals. Space is limited, so sign up early. For more information, email nvdli@nvidia.com.

Categories
Misc

[D] : Any good resources for object detection using TensorFlow? Stuck on it!

submitted by /u/anotsohypocritesoul

Categories
Misc

TensorFlow practice vs mathematics

Hello there,

I often see the following code for regression problems (here, a linear regression):

import tensorflow.compat.v1 as tf
import numpy as np
import matplotlib.pyplot as plt

tf.disable_v2_behavior()  # needed for placeholders/sessions under TF 2.x

learning_rate = 0.01
training_epochs = 100

# Synthetic data: y = 2x plus Gaussian noise
x_train = np.linspace(-1, 1, 101)
y_train = 2 * x_train + np.random.randn(*x_train.shape) * 0.33

X = tf.placeholder(tf.float32)
Y = tf.placeholder(tf.float32)

def model(X, w):
    return tf.multiply(X, w)

w = tf.Variable(0.0, name="weights")
y_model = model(X, w)
cost = tf.square(Y - y_model)  # squared error of a single sample
train_op = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)

sess = tf.Session()
init = tf.global_variables_initializer()
sess.run(init)

# One gradient step per (x, y) pair, i.e. stochastic gradient descent
for epoch in range(training_epochs):
    for (x, y) in zip(x_train, y_train):
        sess.run(train_op, feed_dict={X: x, Y: y})

w_val = sess.run(w)
sess.close()

plt.scatter(x_train, y_train)
y_learned = x_train * w_val
plt.plot(x_train, y_learned, 'r')
plt.show()

But isn't that wrong? My problem is with these lines:

for epoch in range(training_epochs):
    for (x, y) in zip(x_train, y_train):
        sess.run(train_op, feed_dict={X: x, Y: y})

Why is it a problem? Because if you look at how we do it in pure mathematics, it doesn't fit. In math we have the MSE function and we do gradient descent over the whole function. But here it seems that they are doing gradient descent over just parts of the MSE function, in this line:

for (x, y) in zip(x_train, y_train):
    sess.run(train_op, feed_dict={X: x, Y: y})

What do I mean by that? MSE = g1(x) + g2(x) + … + gn(x), and it seems like they do gradient descent on g1(x), then on g2(x), and so on. How exactly does TensorFlow do the calculus in the background?
My problem is that through feed_dict={X: x, Y: y} only one term is evaluated. Let's say x=1 and y=2: TensorFlow will plug those into X and Y, go through the model, and evaluate only one part of the MSE function, say g1(x), but don't you need to do gradient descent over the whole MSE?
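
For comparison, a full-batch version that matches the math would look something like this (a sketch, reusing the placeholders, model, and session from the snippet above):

# Full-batch variant: average the squared errors over all samples
# and take one gradient step per epoch.
cost = tf.reduce_mean(tf.square(Y - y_model))
train_op = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)

for epoch in range(training_epochs):
    sess.run(train_op, feed_dict={X: x_train, Y: y_train})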

submitted by /u/te357

Categories
Misc

Creating a tensor using linspace function

Hello Guys,

I'm challenging myself to create a simple 1-dimensional tensor consisting of integers, ranging from 1 to 10, using the linspace function and with a shape of 6. However, I haven't been successful at doing that. How do I fix this?

My code:

[1,2,3,4,5,6,7,8,9,10]

torch.linspace(1, 1, 10)
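
For reference, torch.linspace(start, end, steps) takes the two endpoints and the number of points (not the step size), so the third argument controls the shape. A quick sketch:

import torch

# torch.linspace(start, end, steps) returns `steps` evenly spaced values
# from `start` to `end`, inclusive.
print(torch.linspace(1, 10, steps=10))  # tensor([ 1.,  2., ..., 10.])
print(torch.linspace(1, 10, steps=6))   # 6 values spanning 1..10 (not all integers)
# Note: torch.linspace(1, 1, 10) gives ten copies of 1.0, which is why the
# attempt above returns all ones.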

submitted by /u/destin95

Categories
Misc

Why am I getting NaN values for a custom Dice loss in Keras?

I am using Keras for boundary/contour detection with a U-Net. When I use binary cross-entropy as the loss, the losses decrease over time as expected, and the predicted boundaries look reasonable.

However, I have tried a custom Dice loss with varying learning rates, and none of them work well.

from tensorflow.keras import backend as K

smooth = 1e-6

def dice_coef(y_true, y_pred):
    y_true_f = K.flatten(y_true)
    y_pred_f = K.flatten(y_pred)
    intersection = K.sum(y_true_f * y_pred_f)
    return (2. * intersection + smooth) / (K.sum(y_true_f) + K.sum(y_pred_f) + smooth)

def dice(y_true, y_pred):
    return 1 - dice_coef(y_true, y_pred)

The loss values don't improve. That is, the training output shows something like:

loss: nan - dice: .9607 - val_loss: nan - val_dice: .9631 

I get NaNs for the losses, and values for dice and val_dice that barely change as the epochs iterate. This happens regardless of the learning rate, anywhere from 0.01 down to 1e-6.

The dimensions of the training images/labels are N x H x W x 1, where N is the number of images and H/W are the height/width of each image.
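
For what it's worth, a standalone sanity check (a sketch assuming TensorFlow 2.x eager execution) shows the loss itself returns a finite value on clean float inputs, so the NaNs presumably come from the inputs or the model outputs:

import tensorflow as tf
from tensorflow.keras import backend as K

smooth = 1e-6

def dice_coef(y_true, y_pred):
    # same definition as above, repeated to keep the check self-contained
    y_true_f = K.flatten(y_true)
    y_pred_f = K.flatten(y_pred)
    intersection = K.sum(y_true_f * y_pred_f)
    return (2. * intersection + smooth) / (K.sum(y_true_f) + K.sum(y_pred_f) + smooth)

# Hypothetical toy batch: one 2x2x1 "image" with labels in {0, 1}
y_true = tf.constant([[[[1.], [0.]], [[1.], [0.]]]], dtype=tf.float32)
y_pred = tf.constant([[[[0.9], [0.1]], [[0.8], [0.2]]]], dtype=tf.float32)
print(float(1 - dice_coef(y_true, y_pred)))  # finite, ~0.15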

Can anyone help?

submitted by /u/74throwaway

Categories
Misc

Feelin’ Like a Million MBUX: AI Cockpit Featured in Popular Mercedes-Benz C-Class

It's hard not to feel your best when your car makes every commute a VIP experience. This week, Mercedes-Benz launched the redesigned C-Class sedan and C-Class wagon, packed with new features for the next generation of driving. Both models prominently feature the latest MBUX AI cockpit, powered by NVIDIA, delivering an intelligent user interface for…

Categories
Misc

Omniverse Assets Available for Download on TurboSquid

TurboSquid and NVIDIA are collaborating to curate thousands of USD models that are available today and ready to use with NVIDIA Omniverse.

Many developers using Omniverse are experiencing enhanced workflows with virtual collaboration and photorealistic simulation. The open platform, which is available now in open beta, enables teams around the world to simultaneously collaborate in real time, using their favorite 3D applications. 

TurboSquid has an extensive library of 3D models that users can easily drag and drop into Omniverse, allowing them to immediately start collaborating with others. This saves developers time, as they can explore Omniverse without worrying about importing or exporting content, model preparation, or polycounts. Users can load TurboSquid’s USD models in Omniverse connectors, and Omniverse ensures consistent quality between teams, contractors, and ecosystems.

To get started, download the NVIDIA Omniverse Launcher from nvidia.com/omniverse. Run the Omniverse Launcher and install Omniverse Create or Omniverse View apps, then import TurboSquid 3D content and start creating.

Learn more by visiting TurboSquid’s Omniverse page, and check out the 3D tool sets now available.

Categories
Offsites

The Technology Behind Cinematic Photos

Looking at photos from the past can help people relive some of their most treasured moments. Last December we launched Cinematic photos, a new feature in Google Photos that aims to recapture the sense of immersion felt the moment a photo was taken, simulating camera motion and parallax by inferring 3D representations in an image. In this post, we take a look at the technology behind this process, and demonstrate how Cinematic photos can turn a single 2D photo from the past into a more immersive 3D animation.

Camera 3D model courtesy of Rick Reitano.

Depth Estimation
Like many recent computational photography features such as Portrait Mode and Augmented Reality (AR), Cinematic photos requires a depth map to provide information about the 3D structure of a scene. Typical techniques for computing depth on a smartphone rely on multi-view stereo, a geometric method that solves for the depth of objects in a scene by simultaneously capturing multiple photos at different viewpoints, where the distances between the cameras are known. In the Pixel phones, the views come from two cameras or dual-pixel sensors.

To enable Cinematic photos on existing pictures that were not taken in multi-view stereo, we trained a convolutional neural network with encoder-decoder architecture to predict a depth map from just a single RGB image. Using only one view, the model learned to estimate depth using monocular cues, such as the relative sizes of objects, linear perspective, defocus blur, etc.
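
As a rough illustration of the encoder-decoder idea (the actual model architecture is not described in detail here, so this is only a sketch), a minimal single-image depth network in Keras might look like:

import tensorflow as tf
from tensorflow.keras import layers

def tiny_depth_net(h=128, w=128):
    # Encoder: downsample the RGB image; decoder: upsample back to a
    # one-channel depth map. Softplus keeps predicted depths positive.
    inp = layers.Input((h, w, 3))
    x = layers.Conv2D(32, 3, strides=2, padding="same", activation="relu")(inp)
    x = layers.Conv2D(64, 3, strides=2, padding="same", activation="relu")(x)
    x = layers.Conv2DTranspose(64, 3, strides=2, padding="same", activation="relu")(x)
    x = layers.Conv2DTranspose(32, 3, strides=2, padding="same", activation="relu")(x)
    depth = layers.Conv2D(1, 3, padding="same", activation="softplus")(x)
    return tf.keras.Model(inp, depth)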

Because monocular depth estimation datasets are typically designed for domains such as AR, robotics, and self-driving, they tend to emphasize street scenes or indoor room scenes instead of features more common in casual photography, like people, pets, and objects, which have different composition and framing. So, we created our own dataset for training the monocular depth model using photos captured on a custom 5-camera rig as well as another dataset of Portrait photos captured on Pixel 4. Both datasets included ground-truth depth from multi-view stereo that is critical for training a model.

Mixing several datasets in this way exposes the model to a larger variety of scenes and camera hardware, improving its predictions on photos in the wild. However, it also introduces new challenges, because the ground-truth depth from different datasets may differ from each other by an unknown scaling factor and shift. Fortunately, the Cinematic photo effect only needs the relative depths of objects in the scene, not the absolute depths. Thus we can combine datasets by using a scale-and-shift-invariant loss during training and then normalize the output of the model at inference.
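
As a sketch of what a scale-and-shift-invariant comparison can look like (the exact loss used in training is not specified here), one can align the prediction to the ground truth with a closed-form least-squares scale and shift before measuring the error:

import numpy as np

def scale_shift_invariant_loss(pred, gt):
    # Align pred to gt with the least-squares scale s and shift t,
    # then measure the remaining squared error (illustrative sketch).
    p, g = pred.ravel(), gt.ravel()
    A = np.stack([p, np.ones_like(p)], axis=1)
    sol, *_ = np.linalg.lstsq(A, g, rcond=None)  # solves min ||A @ [s, t] - g||^2
    s, t = sol
    return np.mean((s * p + t - g) ** 2)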

The Cinematic photo effect is particularly sensitive to the depth map’s accuracy at person boundaries. An error in the depth map can result in jarring artifacts in the final rendered effect. To mitigate this, we apply median filtering to improve the edges, and also infer segmentation masks of any people in the photo using a DeepLab segmentation model trained on the Open Images dataset. The masks are used to pull forward pixels of the depth map that were incorrectly predicted to be in the background.
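
A hypothetical version of that cleanup step (median filtering plus using a person mask to pull masked pixels forward; the production implementation is not public here) could look like:

import numpy as np
from scipy.ndimage import median_filter

def refine_depth(depth, person_mask):
    # Smooth depth edges with a median filter, then clamp pixels inside
    # the (assumed boolean) person mask so none of them sit farther away
    # than the person's median depth.
    depth = median_filter(depth, size=5)
    if person_mask.any():
        person_depth = np.median(depth[person_mask])
        depth[person_mask] = np.minimum(depth[person_mask], person_depth)
    return depth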

Camera Trajectory
There can be many degrees of freedom when animating a camera in a 3D scene, and our virtual camera setup is inspired by professional video camera rigs to create cinematic motion. Part of this is identifying the optimal pivot point for the virtual camera’s rotation in order to yield the best results by drawing one’s eye to the subject.

The first step in 3D scene reconstruction is to create a mesh by extruding the RGB image onto the depth map. By doing so, neighboring points in the mesh can have large depth differences. While this is not noticeable from the “face-on” view, the more the virtual camera is moved, the more likely it is to see polygons spanning large changes in depth. In the rendered output video, this will look like the input texture is stretched. The biggest challenge when animating the virtual camera is to find a trajectory that introduces parallax while minimizing these “stretchy” artifacts.

The parts of the mesh with large depth differences become more visible (red visualization) once the camera is away from the “face-on” view. In these areas, the photo appears to be stretched, which we call “stretchy artifacts”.
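
One simple way to flag such regions (an illustrative sketch, not the production code) is to mark pixels whose depth differs sharply from an adjacent pixel:

import numpy as np

def stretch_mask(depth, threshold=0.1):
    # Mark pixels with a large depth jump to a neighbor; mesh polygons
    # spanning these jumps stretch when the virtual camera moves.
    dy = np.abs(np.diff(depth, axis=0, prepend=depth[:1]))
    dx = np.abs(np.diff(depth, axis=1, prepend=depth[:, :1]))
    return np.maximum(dx, dy) > threshold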

Because of the wide spectrum in user photos and their corresponding 3D reconstructions, it is not possible to share one trajectory across all animations. Instead, we define a loss function that captures how much of the stretchiness can be seen in the final animation, which allows us to optimize the camera parameters for each unique photo. Rather than counting the total number of pixels identified as artifacts, the loss function triggers more heavily in areas with a greater number of connected artifact pixels, which reflects a viewer’s tendency to more easily notice artifacts in these connected areas.
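
A toy version of such a loss (a sketch; the actual formulation is not given here) could weight each artifact pixel by the size of the connected component it belongs to:

import numpy as np
from scipy.ndimage import label

def stretchiness_loss(artifact_mask):
    # Penalize connected blobs of artifact pixels superlinearly, so one
    # large contiguous artifact costs more than the same number of
    # scattered pixels.
    labels, n = label(artifact_mask)
    sizes = np.bincount(labels.ravel())[1:]  # component sizes, skipping background
    return float(np.sum(sizes.astype(np.float64) ** 2))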

We utilize padded segmentation masks from a human pose network to divide the image into three different regions: head, body and background. The loss function is normalized inside each region before computing the final loss as a weighted sum of the normalized losses. Ideally the generated output video is free from artifacts but in practice, this is rare. Weighting the regions differently biases the optimization process to pick trajectories that prefer artifacts in the background regions, rather than those artifacts near the image subject.

During the camera trajectory optimization, the goal is to select a path for the camera with the least amount of noticeable artifacts. In these preview images, artifacts in the output are colored red while the green and blue overlay visualizes the different body regions.
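
Putting the pieces together, a sketch of the region-normalized weighted sum (the weights here are made up for illustration) might be:

import numpy as np

def region_weighted_loss(artifact_mask, head_mask, body_mask, bg_mask,
                         w_head=4.0, w_body=2.0, w_bg=1.0):
    # Normalize the artifact count inside each region by the region's
    # area, then combine with per-region weights (hypothetical values)
    # so artifacts near the subject cost more than background ones.
    def normalized(region):
        area = region.sum()
        return (artifact_mask & region).sum() / area if area else 0.0
    return (w_head * normalized(head_mask)
            + w_body * normalized(body_mask)
            + w_bg * normalized(bg_mask))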

Framing the Scene
Generally, the reprojected 3D scene does not neatly fit into a rectangle with portrait orientation, so it was also necessary to frame the output with the correct aspect ratio while still retaining the key parts of the input image. To accomplish this, we use a deep neural network that predicts per-pixel saliency of the full image. When framing the virtual camera in 3D, the model identifies and captures as many salient regions as possible while ensuring that the rendered mesh fully occupies every output video frame. This sometimes requires the model to shrink the camera’s field of view.

Heatmap of the predicted per-pixel saliency. We want the creation to include as much of the salient regions as possible when framing the virtual camera.

Conclusion
Through Cinematic photos, we implemented a system of algorithms – with each ML model evaluated for fairness – that work together to allow users to relive their memories in a new way, and we are excited about future research and feature improvements. Now that you know how they are created, keep an eye open for automatically created Cinematic photos that may appear in your recent memories within the Google Photos app!

Acknowledgments
Cinematic Photos is the result of a collaboration between Google Research and Google Photos teams. Key contributors also include: Andre Le, Brian Curless, Cassidy Curtis, Ce Liu‎, Chun-po Wang, Daniel Jenstad, David Salesin, Dominik Kaeser, Gina Reynolds, Hao Xu, Huiwen Chang, Huizhong Chen‎, Jamie Aspinall, Janne Kontkanen, Matthew DuVall, Michael Kucera, Michael Milne, Mike Krainin, Mike Liu, Navin Sarma, Orly Liba, Peter Hedman, Rocky Cai‎, Ruirui Jiang‎, Steven Hickson, Tracy Gu, Tyler Zhu, Varun Jampani, Yuan Hao, Zhongli Ding.