Categories
Misc

Severe underfitting CNN models

I am building an image classifier of sorts that takes an image of a speedometer and “reads” the value. I have a collection of about 4,000 images, all labeled with GPS velocity values. I read in the images and create the X and Y training and validation sets, but the model doesn’t learn at all. I even tried pre-built models that TensorFlow provides, like ResNet50 and Xception, both of which gave similar if not identical results: the loss and accuracy stayed constant. When I added regularization it got much worse; the accuracy was still fixed close to zero and the loss skyrocketed past 1,000,000. I realize there is no “silver bullet” when tuning a neural network, so all suggestions are welcome.
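
One detail worth flagging up front: every model in the script ends in `Dense(1, activation='softmax')`. A softmax over a single unit always outputs exactly 1.0 regardless of the input, so the gradients are zero and the loss and accuracy cannot move, which matches the constant metrics described above. Below is a minimal sketch of a regression-style head for this task; it is a hedged suggestion rather than a verified fix, and it assumes grayscale 90x160 inputs as in the script that follows.

```python
from tensorflow import keras
from tensorflow.keras import layers

def create_regression_model():
    # Treat the speedometer reading as a continuous regression target:
    # a linear output unit with a regression loss instead of softmax.
    model = keras.models.Sequential([
        layers.Conv2D(50, (3, 3), activation='relu', input_shape=(90, 160, 1)),
        layers.MaxPooling2D((2, 2)),
        layers.Conv2D(25, (3, 3), activation='relu'),
        layers.MaxPooling2D((2, 2)),
        layers.Flatten(),
        layers.Dense(20, activation='relu'),
        layers.Dense(1)  # linear activation can output any real-valued speed
    ])
    # 'accuracy' is not meaningful for regression; track MAE instead
    model.compile(optimizer='adam', loss='mse', metrics=['mae'])
    return model
```

Scaling the pixel values into the [0, 1] range (dividing by 255) before training usually helps as well, since the loaders below return raw 0-255 values.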

```python

##########################import dependencies

import os
import sys
import tensorflow as tf
import matplotlib.pyplot as plt
import numpy as np
import sklearn as sk
import cv2
import pandas as pd
import scipy
from scipy.signal import fftconvolve
from glob import glob
from os.path import join, basename
from tensorflow import keras
from tensorflow.keras import layers, models, datasets
from tensorflow.keras import regularizers
from sklearn.model_selection import train_test_split

#####################Create different models to test
# User-defined model that can be changed to experiment with different structures.
# input_shape is important: either resize all images to match it or, if your images
# are too large and resizing would lose a lot of data, change the input shape here
# to match your images, or change the cv2.resize() calls in the load_data functions.
# If you choose to keep your image size, remove the resize() calls from the
# load_data functions and set the width and height to match your images.
# The third dimension specifies the number of channels in your image. To load color
# images into models that have the channels set to 1 (e.g. create_model()),
# change the 1 to a 3.

def create_model():
    model = keras.models.Sequential()
    model.add(layers.Conv2D(50, (3, 3), activation='relu', input_shape=(90, 160, 1)))
    model.add(layers.MaxPooling2D((2, 2)))
    model.add(layers.Conv2D(25, (3, 3), activation='relu'))
    model.add(layers.MaxPooling2D((2, 2)))
    model.add(layers.Conv2D(10, (3, 3), activation='relu'))
    model.add(layers.Flatten())
    model.add(layers.Dense(20, activation='relu'))
    model.add(layers.Dropout(0.1))
    model.add(layers.Dense(10, activation='relu'))
    model.add(layers.Dense(1, activation='softmax'))
    model.compile(optimizer='adam', loss="categorical_crossentropy", metrics=['accuracy'])
    return model

# ResNet architecture

def resnet():
    base_model = keras.applications.resnet50.ResNet50(
        weights='imagenet', include_top=False, input_shape=[90, 160, 3])
    avg = keras.layers.GlobalAveragePooling2D()(base_model.output)
    output = keras.layers.Dense(1, activation='softmax')(avg)
    model = keras.Model(inputs=base_model.input, outputs=output)
    for layer in base_model.layers:
        layer.trainable = False
    optimizer = keras.optimizers.SGD()
    model.compile(loss="MSE", optimizer=optimizer, metrics=["accuracy"])
    return model

# Not exact, but based off of the AlexNet architecture

def alexNet():
    model = keras.models.Sequential()
    model.add(layers.Conv2D(96, (11, 11), activation='relu', padding='valid', input_shape=(90, 160, 1)))
    model.add(layers.MaxPooling2D((3, 3), strides=2, padding='valid'))
    model.add(layers.Conv2D(128, (5, 5), activation='relu', padding='same'))
    model.add(layers.MaxPooling2D((3, 3), strides=2, padding='valid'))
    model.add(layers.Conv2D(128, (3, 3), activation='relu', padding='same', strides=1))
    model.add(layers.Conv2D(128, (3, 3), activation='relu', padding='same', strides=1))
    model.add(layers.Conv2D(128, (3, 3), activation='relu', padding='same', strides=1))
    model.add(layers.Flatten())
    model.add(layers.Dense(2048, activation='relu'))
    model.add(layers.Dense(1024, activation='relu'))
    model.add(layers.Dense(1, activation='softmax'))
    model.compile(optimizer='adam', loss="MSE", metrics=['accuracy'])
    return model

# LeNet-5 architecture

def lenet5():
    model = keras.models.Sequential([
        keras.layers.Conv2D(6, 5, activation='tanh', padding="same", input_shape=[90, 160, 1]),
        keras.layers.MaxPooling2D(2, strides=2),
        keras.layers.Conv2D(16, 5, activation='tanh', padding="same"),
        keras.layers.MaxPooling2D(2, strides=2),
        keras.layers.Conv2D(120, 5, activation='tanh', padding="same"),
        keras.layers.Flatten(),
        keras.layers.Dense(84, activation='tanh'),
        keras.layers.Dense(1, activation='softmax'),
    ])
    optimizer = keras.optimizers.SGD()
    model.compile(loss="MSE", optimizer=optimizer, metrics=["accuracy"])
    return model

# Xception architecture

def xception():
    base_model = tf.keras.applications.xception.Xception(
        include_top=False, weights='imagenet', input_shape=[90, 160, 3])
    avg = keras.layers.GlobalAveragePooling2D()(base_model.output)
    output = keras.layers.Dense(1, activation='softmax')(avg)
    model = keras.Model(inputs=base_model.input, outputs=output)
    optimizer = keras.optimizers.SGD()
    model.compile(loss="MSE", optimizer=optimizer, metrics=["accuracy"])
    return model

##########################Filtering functions
# Input: single image array
# Output: single image array with detected edges

def edges_single_img(img):
    # define the vertical filter
    vertical_filter = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]
    # define the horizontal filter
    horizontal_filter = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]

    n, m, d = img.shape
    edges_img = img.copy()
    # loop over all pixels in the image
    for row in range(3, n - 2):
        for col in range(3, m - 2):
            # create a little local 3x3 box
            local_pixels = img[row - 1:row + 2, col - 1:col + 2, 0]
            # apply the vertical filter
            vertical_transformed_pixels = vertical_filter * local_pixels
            # remap the vertical score
            vertical_score = vertical_transformed_pixels.sum() / 4
            # apply the horizontal filter
            horizontal_transformed_pixels = horizontal_filter * local_pixels
            # remap the horizontal score
            horizontal_score = horizontal_transformed_pixels.sum() / 4
            # combine the horizontal and vertical scores into a total edge score
            edge_score = (vertical_score ** 2 + horizontal_score ** 2) ** 0.5
            # insert this edge score into the edges image
            edges_img[row, col] = [edge_score] * 3
    # remap the values into the 0-1 range in case they went out of bounds
    edges_img = edges_img / edges_img.max()
    return edges_img

##################Untested function
# Input: multi-dimensional array of images
# Output: multi-dimensional array with detected edges

def edges_array(img_array):
    # define the vertical filter
    vertical_filter = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]
    # define the horizontal filter
    horizontal_filter = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]

    w = len(img_array[0, :, 0])
    p = len(img_array[0, 0, :])
    l = len(img_array[:, 0, 0])
    edge_img_array = np.zeros((l, w, p))
    for i in range(len(img_array[:, 0, 0])):
        img = img_array[i, :, :]
        n, m, d = img.shape
        edges_img = img.copy()
        # loop over all pixels in the image
        for row in range(3, n - 2):
            for col in range(3, m - 2):
                # create a little local 3x3 box
                local_pixels = img[row - 1:row + 2, col - 1:col + 2, 0]
                # apply the vertical filter
                vertical_transformed_pixels = vertical_filter * local_pixels
                # remap the vertical score
                vertical_score = vertical_transformed_pixels.sum() / 4
                # apply the horizontal filter
                horizontal_transformed_pixels = horizontal_filter * local_pixels
                # remap the horizontal score
                horizontal_score = horizontal_transformed_pixels.sum() / 4
                # combine the horizontal and vertical scores into a total edge score
                edge_score = (vertical_score ** 2 + horizontal_score ** 2) ** 0.5
                # insert this edge score into the edges image
                edges_img[row, col] = [edge_score] * 3
        # remap the values into the 0-1 range in case they went out of bounds
        edges_img = edges_img / edges_img.max()
        edge_img_array[i, :, :] = edges_img
    return edge_img_array
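
# Alternative sketch (hypothetical, not part of the original script): cv2.Sobel
# computes the same horizontal and vertical gradients that the loops above build
# by hand, in optimized native code. For a single 2-D grayscale image:
def edges_fast(img_gray):
    gx = cv2.Sobel(img_gray, cv2.CV_64F, 1, 0, ksize=3)  # horizontal gradient
    gy = cv2.Sobel(img_gray, cv2.CV_64F, 0, 1, ksize=3)  # vertical gradient
    mag = np.sqrt(gx ** 2 + gy ** 2)                     # combined edge score
    return mag / mag.max()                               # remap into 0-1 range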

#############User input functions

def get_mode():
    print("Operating modes:\n 1. Run all models\n 2. Run single model\n")
    mode = input("Enter Operating mode: ")
    mode = int(mode)
    if mode > 2 or mode < 1:
        print("Error 1: Invalid Operating mode! Please enter a valid operating mode")
    return mode

def model_choice(mode):
    if mode == 1:
        print("Loading all models…")
        model_num = 0
        return model_num
    elif mode == 2:
        print("Available Models:\n 1. Custom Model\n 2. lenet5\n 3. AlexNet Variant\n 4. ResNet50\n 5. Xception")
        model_num = int(input("Enter the model that you would like to test: "))
        if model_num > 5 or model_num < 1:
            print("Error 2: Invalid model! Please enter a valid model")
        else:
            return model_num
    else:
        print("Error 3: Invalid mode passed")

############Load and parse data (preprocessing)
# Input: these functions take a file path as their input. This is the path to the
# directory with ALL images.
# Important note: image names must follow the naming convention "img#####_##.##.jpeg".
# The first 5 digits are the image index, the next 2 are the tens and ones places of
# the velocity, and the final 2 are the tenths and hundredths of the velocity.
# If you wish to use a different naming convention, please edit the create_labels
# function to use your naming style.
# Output: preprocessed, labeled data.
# create_labels creates a list of all file names in the directory and sorts them.

def create_labels(path_to_imgs):
    files = glob(join(path_to_imgs, '*', '*.jpg'), recursive=True)
    files.sort(key=basename)
    labels = []
    for x in files:
        start = x.find('_')
        end = x.find('.j')
        labels.append(float(x[start + 1:end]))
    labels = np.asarray(labels, dtype=float)
    return labels, files
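
# Example with a hypothetical filename: for "imgs/run1/img00042_35.50.jpg",
# find('_') locates the underscore, find('.j') locates the start of the
# extension, and the slice between them is "35.50", giving the label 35.5.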

# Loads color images and converts them to grayscale.
# In the load_data*() functions, you can change the w and h variables to match your
# images if you do not want to resize, or change them to whatever size works best
# for your images.

def load_data_gray(path_to_imgs):
    w = 16 * 10
    h = 9 * 10
    labels, files = create_labels(path_to_imgs)
    num_imgs = len(files)
    temp_array = np.zeros((num_imgs, h, w))
    for idx, path in enumerate(files):
        img = cv2.imread(path)
        img = cv2.resize(img, (w, h))
        img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
        temp_array[idx, :, :] = img
    temp_array = temp_array.reshape(num_imgs, 90, 160, -1)
    return temp_array, labels

# Assumes the library of images is already in grayscale.

def load_data_asGray(path_to_imgs):
    w = 16 * 10
    h = 9 * 10
    labels, files = create_labels(path_to_imgs)
    num_imgs = len(files)
    temp_array = np.zeros((num_imgs, h, w))
    for idx, path in enumerate(files):
        # read as a single channel so the image fits the 2-D slot below
        img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
        img = cv2.resize(img, (w, h))
        temp_array[idx, :, :] = img
    temp_array = temp_array.reshape(num_imgs, 90, 160, -1)
    return temp_array, labels

# Loads color images.

def load_data_color(path_to_imgs):
    w = 16 * 10
    h = 9 * 10
    labels, files = create_labels(path_to_imgs)
    num_imgs = len(files)
    temp_array = np.zeros((num_imgs, h, w, 3))
    for idx, path in enumerate(files):
        img = cv2.imread(path)
        img = cv2.resize(img, (w, h))
        temp_array[idx, :, :, :] = img
    temp_array = temp_array.reshape(num_imgs, 90, 160, 3)
    return temp_array, labels

def custom_train_test_split(img_arr, labels):
    X_train, X_test, y_train, y_test = train_test_split(
        img_arr, labels, test_size=0.2, random_state=42)
    return X_train, X_test, y_train, y_test

#####################Training functions
# Input: for models that only use grayscale images, we pass in trainingImages and
# trainingLabels so the data only has to be loaded once. For models that require
# color images (ResNet, Xception), we do not pass trainingImages and trainingLabels;
# those are loaded inside the training function.
# Output: an array of training data

def train_Xception_model(path_to_imgs):
    model = xception()
    print(model.summary())
    trainingImages, trainingLabels = load_data_color(path_to_imgs)
    history = model.fit(trainingImages, trainingLabels, epochs=10)
    return history.history

def train_custom_model(trainingImages, trainingLabels):
    model = create_model()
    print(model.summary())
    X_train, X_test, y_train, y_test = custom_train_test_split(trainingImages, trainingLabels)
    history = model.fit(X_train, y_train, epochs=100, batch_size=100,
                        validation_data=(X_test, y_test))
    return history.history

def train_resnet_model(path_to_imgs):
    model = resnet()
    trainingImages, trainingLabels = load_data_color(path_to_imgs)
    print(model.summary())
    history = model.fit(trainingImages, trainingLabels, epochs=10)
    return history.history

def train_alexnet_model(trainingImages, trainingLabels):
    model = alexNet()
    print(model.summary())
    history = model.fit(trainingImages, trainingLabels, epochs=10)
    return history.history

def train_lenet5_model(trainingImages, trainingLabels):
    model = lenet5()
    print(model.summary())
    history = model.fit(trainingImages, trainingLabels, epochs=10)
    return history.history

# Input: the mode to operate in, and model_num to determine which model to train if mode != 1
# Output: array(s) of training data

def training(mode, model_num, path_to_imgs):
    trainingImages, trainingLabels = load_data_gray(path_to_imgs)
    if mode == 1:
        history_custom = train_custom_model(trainingImages, trainingLabels)
        history_alex = train_alexnet_model(trainingImages, trainingLabels)
        history_lenet5 = train_lenet5_model(trainingImages, trainingLabels)
        history_resnet = train_resnet_model(path_to_imgs)
        history_xception = train_Xception_model(path_to_imgs)
        # df1 = pd.DataFrame(history_custom.history)
        # df1.to_excel("custom_model_training.xlsx")
        # df2 = pd.DataFrame(history_alex.history)
        # df2.to_excel("alexnet_model_training.xlsx")
        # df3 = pd.DataFrame(history_lenet5.history)
        # df3.to_excel("lenet5_model_training.xlsx")
        # df4 = pd.DataFrame(history_resnet.history)
        # df4.to_excel("resnet_model_training.xlsx")
        # df5 = pd.DataFrame(history_xception.history)
        # df5.to_excel("xception_model_training.xlsx")
        return history_custom, history_lenet5, history_alex, history_resnet, history_xception
    elif mode == 2:
        if model_num == 1:
            history = train_custom_model(trainingImages, trainingLabels)
            # df = pd.DataFrame(history.history)
            # df.to_excel("custom_model_training.xlsx")
            return history
        elif model_num == 2:
            history = train_lenet5_model(trainingImages, trainingLabels)
            # df = pd.DataFrame(history.history)
            # df.to_excel("lenet5_model_training.xlsx")
            return history
        elif model_num == 3:
            history = train_alexnet_model(trainingImages, trainingLabels)
            # df = pd.DataFrame(history.history)
            # df.to_excel("alexnet_model_training.xlsx")
            return history
        elif model_num == 4:
            history = train_resnet_model(path_to_imgs)
            # df = pd.DataFrame(history.history)
            # df.to_excel("resnet_model_training.xlsx")
            return history
        elif model_num == 5:
            history = train_Xception_model(path_to_imgs)
            # df = pd.DataFrame(history.history)
            # df.to_excel("xception_model_training.xlsx")
            return history
        else:
            print("Invalid model number! Please choose again\n")
    else:
        print("Invalid mode! Please choose again\n")

#################Adjust TensorFlow settings to run on GPU
# Input: boolean telling TensorFlow to use GPU acceleration or to strictly use the CPU
# Output: void

def set_tf_settings(use_GPU):
    if not use_GPU:
        os.environ["CUDA_VISIBLE_DEVICES"] = "-1"
    else:
        physical_devices = tf.config.list_physical_devices('GPU')
        try:
            tf.config.experimental.set_memory_growth(physical_devices[0], True)
            print(tf.config.experimental.get_device_details(physical_devices[0]))
            print(tf.config.experimental.get_memory_usage)
        except Exception:
            # Invalid device, or cannot modify virtual devices once initialized.
            pass
        print(physical_devices[0])

#######################Export model functions
# Input: a TensorFlow model instance and the name that you want to give it
# Output: a JSON file for the model architecture and an h5 file for the weight values.
# Both can be loaded back using the inverse functions:
#   json_file = open('model.json', 'r')
#   loaded_model_json = json_file.read()
#   json_file.close()
#   loaded_model = model_from_json(loaded_model_json)
#   loaded_model.load_weights("model.h5")

def export_model(model, model_name):
    model_json = model.to_json()
    with open(model_name + ".json", "w") as json_file:
        json_file.write(model_json)
    print("Model saved!")
    model.save_weights(model_name + "_weights.h5")
    print("Weights saved!")

PATH = '/home/[name]/Git/speedometer-data/imgs/'

def main():
    mode = get_mode()
    model_num = model_choice(mode)
    if mode == 2:
        history = training(mode, model_num, PATH)
        print(history)
    elif mode == 1:
        history_custom, history_lenet5, history_alex, history_resnet, history_xception = training(
            mode, model_num, PATH)
        print(history_custom)
        print(history_lenet5)
        print(history_alex)
        print(history_resnet)
        print(history_xception)
    else:
        print("Error 4: Invalid Mode Selected!")

main()

```

submitted by /u/NameError-undefined
[visit reddit] [comments]

Categories
Misc

NVIDIA Research: Appearance-Driven Automatic 3D Model Simplification

NVIDIA will be presenting a new paper introducing a method for generating level-of-detail representations of complex models, taking both geometry and surface appearance into account.

NVIDIA will be presenting a new paper titled “Appearance-Driven Automatic 3D Model Simplification” at the Eurographics Symposium on Rendering 2021 (EGSR), June 29-July 2, introducing a method for generating level-of-detail representations of complex models, taking both geometry and surface appearance into account.

Level-of-detail for aggregate geometry, where we represent each leaf as a semi-transparent textured quad. The geometrical complexity is greatly reduced, to just 0.4% of the original triangle count, with little visual impact.

Level-of-detail has long been used in computer games as a means of improving performance and reducing aliasing artifacts that may occur due to shading detail or small geometric features. Traditional approaches to level of detail include mesh simplification, normal map baking, and shading/BSDF prefiltering. Each problem is typically tackled in isolation.

We approach level-of-detail entirely in image space, with our optimization objective being “does a simplified model look like the reference when rendered from a certain distance?” (i.e., we use a standard image loss). This perspective is not entirely new, but recent advances in differentiable rendering have transformed it from a theoretical exercise to something highly practical, with excellent performance. We propose an efficient inverse rendering method and system that can be used to simultaneously optimize shape and materials to generate level-of-detail models, or clean up the result of automatic simplification tools.

Approaching model simplification through inverse rendering lets us unify previous methods into a single system, optimizing for a single loss. This is important, because the system can negotiate which rendering term is best suited to represent a detail. An example is shown in the image below, where we create a simplified version of the Ewer statue. By using normal mapping in the inverse rendering setup, the system automatically determines which features are best represented by geometry, and which can be represented by the normal map.
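
As a schematic illustration only (not the paper’s actual code): with a differentiable renderer, one optimization step reduces to gradient descent on an image-space loss. Here `differentiable_render` is a hypothetical stand-in for the differentiable rendering step, and `params` holds the trainable vertex positions, normal map, and material textures of the simplified model.

```python
import tensorflow as tf

def simplification_step(params, reference_image, camera, optimizer):
    # params: list of trainable tf.Variables (vertices, normal map, materials)
    with tf.GradientTape() as tape:
        rendered = differentiable_render(params, camera)  # hypothetical renderer
        loss = tf.reduce_mean(tf.square(rendered - reference_image))  # image loss
    grads = tape.gradient(loss, params)
    optimizer.apply_gradients(zip(grads, params))
    return loss
```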

The Ewer statue: normal map, our simplified model (7k triangles), and the reference (300k triangles).

We show that our method is applicable to a wide range of applications, including level-of-detail, normal and displacement map baking, shape and appearance prefiltering, and simplification of aggregate geometry, all while supporting animated geometry. We can additionally convert between surface representations (e.g. from an implicit surface to a mesh), between different material representations, and between different renderers.

Refer to the paper and supplemental material for full results. Our source code is publicly available on GitHub.

Learn more: Check out the project website.

Categories
Misc

NVIDIA Research: Learning and Rendering Dynamic Global Illumination with One Tiny Neural Network in Real-Time

Today, NVIDIA is releasing a SIGGRAPH 2021 technical paper, “Real-time Neural Radiance Caching for Path Tracing” that introduces another leap forward in real-time global illumination: Neural Radiance Caching.

Global illumination, that is, illumination due to light bouncing around in a scene, is essential for rich, realistic visuals. This is a challenging task, even in cinematic rendering, because it is difficult to find all the paths of light that contribute meaningfully to an image. Solving this problem through brute force requires hundreds, sometimes thousands of paths per pixel, but this is far too expensive for real-time rendering.

Direct illumination alone (left) lacks indirect reflections. Global illumination (right) adds indirect reflections, resulting in refined image detail and realism. The images were rendered offline.

Before NVIDIA RTX introduced real-time ray tracing to games, global illumination in games was largely static. Since then, technologies such as RTXGI bring dynamic global illumination to life. They overcome the limited realism of pre-computing “baked” lighting in dynamic worlds and simplify an otherwise tedious lighting design process.

Neural Radiance Caching combines RTX’s neural network acceleration hardware (NVIDIA Tensor Cores) and ray tracing hardware (NVIDIA RT Cores) to create a system capable of fully-dynamic global illumination that works with all kinds of materials, be they diffuse, glossy, or volumetric. It handles fine-scale textures such as albedo, roughness, or bump maps, and scales to large outdoor environments, requiring neither auxiliary data structures nor scene parameterizations.

Combined with NVIDIA’s state-of-the-art direct lighting algorithm, ReSTIR, Neural Radiance Caching can improve rendering efficiency of global illumination by up to a factor of 100—two orders of magnitude.

NVIDIA’s combination of ReSTIR and Neural Radiance Caching (middle) exhibits less noise than path tracing (left). The right image shows an offline rendered ground truth.

At the heart of the technology is a single tiny neural network that runs up to 9x faster than TensorFlow v2.5.0. Its speed makes it possible to train the network live during gameplay in order to keep up with arbitrary dynamic content. On an NVIDIA RTX 3090 graphics card, Neural Radiance Caching can provide over 1 billion global illumination queries per second.

NVIDIA’s fully fused neural networks outperforming TensorFlow v2.5.0 for a 64 neuron wide (solid line) and 128 neuron wide (dashed line) multi-layer perceptrons on an NVIDIA RTX 3090.

Training is often considered an offline process with only inference occurring at runtime. In contrast, Neural Radiance Caching performs both training and inference at runtime, showing that real-time training of neural networks is practical.

This paper paves the way for using dynamic neural networks in various other areas of computer graphics and possibly other real-time fields, such as wireless communication and reinforcement learning.

“Our work represents a paradigm shift from complex, memory-intensive auxiliary data structures to generic, compute-intensive neural representations” says Thomas Müller, one of the paper’s authors. “Computation is significantly cheaper than memory transfers, which is why real-time trained neural networks make sense despite their massive number of operations.”

We are excited about future applications enabled by tiny real-time-trained neural networks and look forward to further research of real-time machine learning in computer graphics and beyond. To help researchers and developers adopt the technology, NVIDIA releases the CUDA source code of their tiny neural networks.

Learn more: Check out the project website.

Categories
Misc

NVIDIA Research: An Analytic BRDF for Materials with Spherical Lambertian Scatterers

Researchers at NVIDIA presented a new paper “An Analytic BRDF for Materials with Spherical Lambertian Scatterers” at Eurographics Symposium on Rendering 2021 (EGSR), June 29-July 2, introducing a new BRDF for dusty/diffuse surfaces. 

Our new Lambert-sphere BRDF (right) accurately and efficiently models the reflectance of a porous microstructure consisting of Lambertian spherical particles.

Most rough diffuse BRDFs such as Oren-Nayar are based on a random height-field microsurface, which limits the range of roughnesses that are physically plausible (a height field can only get so spiky before it becomes implausible). To avoid this limitation and extend the usable range of rough diffuse BRDFs, we take a volumetric approach to surface microstructure and derive a BRDF for very rough diffuse materials. It is simple and intuitive to control with a single diffuse color parameter, and it produces more saturated colors and backscattering than other models.

The intersection of volume and surface representations in computer graphics is seeing rapid growth with new techniques such as NeRF. Our ability to seamlessly interchange between surface and volume descriptions of the same scene with no noticeable appearance change is an important tool for efficiently authoring and rendering complex scenes. A BRDF is one such form of representation interchange. In this case, we derive the BRDF that simulates a porous volumetric microstructure consisting of Lambertian spherical particles (pictured above). In some sense this results in an infinitely rough version of the Oren-Nayar BRDF. The resulting BRDF can be used to render diffuse porous materials such as foam up to 100 times more efficiently than using stochastic random-walk methods.

Our new Lambert-sphere BRDF produces brighter backscattering and more saturated rim lighting than other diffuse BRDFs.

We call our BRDF the Lambert-sphere (LS) BRDF. We present a highly accurate version that is only 30% slower to evaluate than Oren-Nayar, and a faster approximate version for real-time applications. We also include importance sampling of the Lambertian-sphere phase function for use in rendering large diffusive smoke and debris particles. Below we compare our BRDF to the Lambertian and Oren-Nayar BRDFs, and to Chandrasekhar’s BRDF, which consists of a thick volumetric layer of mirror spheres in an absorbing matrix.

Learn more: Check out the project website.

Categories
Misc

NVIDIA Research: An Unbiased Ray-Marching Transmittance Estimator

NVIDIA researchers will present their paper “An Unbiased Ray-Marching Transmittance Estimator” at SIGGRAPH 2021, August 9-13, showing a new way to compute visibility in scenes with complex volumetric effects.

We present new methods for reducing a common source of noise in scenes with volumetric effects.

When Monte Carlo sampling is used to compute lighting effects such as soft shadows and global illumination, shadow rays are used to query the visibility between lights and surfaces.  In many scenes, visibility is a simple binary answer that is efficiently queried using the RTX RT Cores.  However, for volumetric effects such as clouds, smoke and explosions, visibility is a fractional quantity ranging from 0.0 to 1.0, and computing this quantity efficiently and accurately between any two points in a scene is an essential part of photorealistic real-time rendering.  Visibility that accounts for volumetric effects is also called transmittance.

We present a new algorithm, which we call unbiased ray marching, for evaluating this transmittance in general scenes. Our key result is a new statistically-unbiased Monte Carlo method that randomly queries the volume densities along a ray and computes a visibility estimate from these values.  For high-performance rendering, the goal is to compute an accurate low-noise estimate while minimizing the number of times the volume data needs to be  accessed.  

The research paper provides a new efficiency analysis of the transmittance estimation problem. This yields a number of key insights about when and how various transmittance estimators achieve the optimal balance of low noise and low cost.  Building on these insights, and leveraging previous work from graphics, physics and statistics, we derive a new transmittance estimator that is universally optimal relative to 50 years of prior art, and often ten times more efficient than earlier methods.  Technically, the end result is based on a power-series expansion of the exponential function that appears in the transmittance equation, and 90% of the time our new estimator simply evaluates the first term in this expansion. This first term corresponds to the traditional biased ray-marching algorithm. The subsequent terms of the expansion, which our estimator evaluates 10% of the time, correct for the bias.  This key insight allows us to benefit from the efficiency of ray marching without being plagued by its bias – occasional light-leaking artifacts seen as, for example, overly bright clouds.
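
For intuition, in standard volume-rendering notation (ours, not necessarily the paper’s): the transmittance along a ray with extinction coefficient σ(t) is T = exp(−τ), where τ = ∫ σ(t) dt is the optical depth, and the power series in question is the expansion exp(−τ) = Σ_{k≥0} (−τ)^k / k!. Evaluating only the leading term of such an expansion is cheap but biased; occasionally evaluating the higher-order terms, as the estimator described above does, corrects that bias on average.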

The new method helps reduce the noise of shadows, such as the one cast on the floor by the smoke plume in the figure above, where we see a dramatic reduction in noise at the same render time.

Putting a new twist on an old method, unbiased ray marching virtually eliminates a common source of noise in path-traced renderings of scenes with rich volumetric effects.

Learn more: Check out the project website.

Categories
Misc

ICYMI: NVIDIA Jetson for Robot Operating System

In this month’s technical digest, we’re highlighting this powerful capability and offering a collection of resources to help ROS users get familiar with the power of Jetson, Isaac SDK, and Isaac Sim, along with a success story from Bilberry, a member of the NVIDIA Inception program.

While you are likely already familiar with Jetson as the NVIDIA platform for edge AI, you might be surprised to learn that the power of Jetson isn’t limited to solutions built entirely on Jetson platforms. JetPack on Jetson and Isaac SDK, our robotics platform, add AI capabilities to existing autonomous machines and bridge dependencies for diverse hardware and software solutions.

In this month’s technical digest, we’re highlighting this powerful capability and offering a collection of resources to help Robot Operating System (ROS) users get familiar with the power of Jetson, Isaac SDK, Isaac Sim, and a success story on a startup in the NVIDIA Inception program.

Jetson and ROS tutorials

Accelerating AI Modules for ROS and ROS 2 on NVIDIA Jetson Platform
A showcase of support for ROS and ROS 2 on NVIDIA Jetson developer kits.
Read now >

Building Robotics Applications Using ROS and NVIDIA Isaac SDK
Learn how to use a ROS application stack with NVIDIA Isaac SDK and Isaac Sim.
Read now >  

Implementing Robotics Applications with ROS 2 and AI on the NVIDIA Jetson Platform
Learn how to perform classification, object detection and pose estimation with ROS 2 on Jetson.
Read now >

Top 5 Jetson resources

Meet Jetson, the platform for AI at the edge
NVIDIA Jetson is used by professional developers to create breakthrough AI products across all industries, and by students and enthusiasts for hands-on AI learning and making amazing projects.
Meet Jetson >

Getting started with Jetson Developer Kits
Jetson developer kits are used by professionals to develop and test software for products based on Jetson modules, and by students and enthusiasts for projects and learning. Each developer kit includes a non-production specification Jetson module attached to a reference carrier board with standard hardware interfaces for flexible development and rapid prototyping.
Learn more >

Free 8-hour class on getting started with AI on Jetson Nano
In this course, you’ll use Jupyter iPython notebooks on your own Jetson Nano to build a deep learning classification project with computer vision models. Upon completion, you receive a certificate.
Start learning hands-on today >

Jetson and Embedded Systems developer forums
The Jetson and Embedded Systems community is active, vibrant, and continually monitored by the Jetson Product and Engineering teams.
Meet the community >

Jetson community projects
Explore and learn from Jetson projects created by us and our community.
Explore community projects >

Recommended hardware

Jetson Nano 2GB
Most affordable
At 457 GFLOPs for $59, this developer kit is the ultimate starter AI computer.
Learn more >
Jetson Xavier NX
Great for multiple use cases
Run modern neural networks and advanced AI applications within the footprint of a credit card.
Learn more >
Partner solutions
Production-ready now
Our ecosystem partners can support subsystem designs all the way up to fully realized autonomous machines with their associated requirements.
Learn more >

Inception spotlight

Bilberry: Using Jetson for Computer Vision for Crop Surveying
In 2016, former dormmates from École Nationale Supérieure d’Arts et Métiers in Paris founded Bilberry. The startup developed a weed-recognition solution, powered by the NVIDIA Jetson edge AI platform, for precision application of herbicides at corn and wheat farms, offering as much as a 92% reduction in herbicide usage.
Learn more >

Do you have a startup? Join NVIDIA Inception’s global network of over 7,500 AI and data science startups.

Categories
Misc

NVIDIA Announces Instant AI Infrastructure for Enterprises

Built with Enterprise IT Partners and Deployed First at Equinix, NVIDIA AI LaunchPad Includes End-to-End NVIDIA Hardware and Software Stack to Accelerate AI from Hybrid Cloud to Edge. SANTA …

Categories
Misc

NVIDIA Fleet Command Scales Edge AI Services for Enterprises

SaaS Platform Helps Leading Device Makers, Factories, Warehouses and Retailers Deploy AI Products and Services. SANTA CLARA, Calif., June 22, 2021 (GLOBE NEWSWIRE) — NVIDIA today announced …

Categories
Misc

Tesla Unveils Top AV Training Supercomputer Powered by NVIDIA A100 GPUs

Tackling one of the largest computing challenges of this lifetime requires larger-than-life computing. At CVPR this week, Andrej Karpathy, senior director of AI at Tesla, unveiled the in-house supercomputer the automaker is using to train deep neural networks for Autopilot and self-driving capabilities. The cluster uses 720 nodes of 8x NVIDIA A100 Tensor Core GPUs. Read article >

The post Tesla Unveils Top AV Training Supercomputer Powered by NVIDIA A100 GPUs appeared first on The Official NVIDIA Blog.

Categories
Misc

Can someone help me fix this error?

Traceback (most recent call last):
  File "c:\Users\Landen\OneDrive\Desktop\Sign Language Detector\RealTimeObjectDetection\Tensorflow\scripts\generate_tfrecord.py", line 62, in <module>
    label_map_dict = label_map_util.get_label_map_dict(label_map)
  File "C:\Users\Landen\AppData\Local\Programs\Python\Python39\lib\site-packages\object_detection\utils\label_map_util.py", line 164, in get_label_map_dict
    label_map = load_labelmap(label_map_path)
  File "C:\Users\Landen\AppData\Local\Programs\Python\Python39\lib\site-packages\object_detection\utils\label_map_util.py", line 133, in load_labelmap
    label_map_string = fid.read()
  File "C:\Users\Landen\AppData\Local\Programs\Python\Python39\lib\site-packages\tensorflow\python\lib\io\file_io.py", line 117, in read
    self._preread_check()
  File "C:\Users\Landen\AppData\Local\Programs\Python\Python39\lib\site-packages\tensorflow\python\lib\io\file_io.py", line 79, in _preread_check
    self._read_buf = _pywrap_file_io.BufferedInputStream(
TypeError: __init__(): incompatible constructor arguments.

The following argument types are supported:

  1. tensorflow.python.lib.io._pywrap_file_io.BufferedInputStream(filename: str, buffer_size: int, token: tensorflow.python.lib.io._pywrap_file_io.TransactionToken = None)

    Invoked with: item { name: "Hello" id: 1 } item { name: "ILoveYou" id: 2 } item { name: "No" id: 3 } item { name: "Thanks" id: 4 } item { name: "Yes" id: 5 } , 524288
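
The final TypeError points at the cause: `BufferedInputStream` expects a filename string, but it received the already-parsed label map (the `item { ... }` entries) instead. In the `object_detection` version shown in this traceback, `get_label_map_dict()` opens the file itself (line 164 calls `load_labelmap(label_map_path)`), so it should be given the path to the `.pbtxt` file rather than a loaded label map. A minimal sketch of the likely fix, with `LABEL_MAP_PATH` as a hypothetical stand-in for whatever path the script loads the label map from:

```python
from object_detection.utils import label_map_util

LABEL_MAP_PATH = 'path/to/label_map.pbtxt'  # hypothetical; use the script's real path

# Pass the path, not the loaded label map proto:
label_map_dict = label_map_util.get_label_map_dict(LABEL_MAP_PATH)
```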

submitted by /u/Spud783
[visit reddit] [comments]