Categories
Misc

As Fast as One Can Gogh: Turn Sketches Into Stunning Landscapes with NVIDIA Canvas

Turning doodles into stunning landscapes — there’s an app for that. The NVIDIA Canvas app, now available as a free beta, brings the real-time painting tool GauGAN to anyone with an NVIDIA RTX GPU. Developed by the NVIDIA Research team, GauGAN has wowed creative communities at trade shows around the world by using deep learning.

Categories
Misc

Intro to Deep Learning project in TensorFlow 2.x and Python – free course from Udemy

submitted by /u/Ordinary_Craft
[visit reddit] [comments]
Categories
Misc

Concatenating 3 multivariate sequences as an input to 1 model?

I’ve been trying to figure it out for about a week now but I keep getting ‘Data cardinality is ambiguous’. I’m creating a sequential model for each multivariate sequence, then concatenating the .output from each of those models as the input to a Keras model. I’m also feeding the inputs in as a list of each .input from each model.

Even when I make the last layer of each sequence’s model a dense layer with the same number of units, the cardinality error still complains about concatenating different sequence lengths.

Any ideas or working code appreciated
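
For reference, a minimal sketch of the multi-input pattern being described, assuming three sequences that share the same number of samples (the ‘Data cardinality is ambiguous’ error is typically raised when the input arrays disagree on that first, sample, dimension):

```python
import numpy as np
from tensorflow.keras import layers, Model

# Toy data: three multivariate sequences that must agree on the number of
# samples (first dimension), even if their timesteps/features differ.
n_samples = 32
x1 = np.random.rand(n_samples, 10, 4)   # (samples, timesteps, features)
x2 = np.random.rand(n_samples, 20, 3)
x3 = np.random.rand(n_samples, 15, 5)
y = np.random.rand(n_samples, 1)

def branch(timesteps, features):
    inp = layers.Input(shape=(timesteps, features))
    h = layers.LSTM(16)(inp)                 # collapse each sequence to a fixed-size vector
    h = layers.Dense(8, activation="relu")(h)
    return inp, h

in1, out1 = branch(10, 4)
in2, out2 = branch(20, 3)
in3, out3 = branch(15, 5)

merged = layers.concatenate([out1, out2, out3])  # fixed-size vectors concatenate cleanly
output = layers.Dense(1)(merged)

model = Model(inputs=[in1, in2, in3], outputs=output)
model.compile(optimizer="adam", loss="mse")
model.fit([x1, x2, x3], y, epochs=2, batch_size=8, verbose=0)
```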

submitted by /u/Techguy13
[visit reddit] [comments]

Categories
Misc

Metropolis Spotlight: Nota Is Transforming Traffic Management Systems With AI

Nota, an NVIDIA Metropolis partner, is using AI to make roadways safer and more efficient with NVIDIA’s edge GPUs and deep learning SDKs.

Nota developed a real-time traffic control solution that uses image recognition technology to identify traffic volume and queues, analyze congestion, and optimize traffic signal controls at intersections. 

Using the DeepStream SDK’s off-the-shelf features, such as line crossing and region-of-interest settings, Nota significantly improved how accurately it could analyze traffic situations. Nota deployed the solution at a busy intersection in Pyeongtaek, South Korea, to analyze traffic flow and control traffic lights in real time. Nota was able to improve traffic flow by 25% during regular hours, and by more than 300% during rush hour, saving the city traffic-congestion-related costs and reducing the time drivers spend stuck in traffic.

Read more in our solution showcase.

Categories
Misc

Metropolis Spotlight: INEX Is Revolutionizing Toll Road Systems with Real-time Video Processing

INEX Technologies, an NVIDIA Metropolis partner, designs, develops, and manufactures comprehensive hardware and software solutions for license plate recognition and vehicle identification.

The INEX RoadView solution provides automatic axle counting, vehicle classification, and lane zone tracking and triggering using LPR and RoadView cameras. RoadView video-based recognition eliminates the need for costly concrete cutting, in-ground loop maintenance, and axle-counting treadles.

NVIDIA GPUs are used to accelerate the real-time video analysis of the INEX ALPR system, which requires incredibly high accuracy along with high throughput and high frame rates. At the edge, INEX uses the NVIDIA Jetson Nano and Jetson NX platforms and the embedded software stack.

Under the hood  

The INEX video pipeline is based on the NVIDIA DeepStream SDK, which helps achieve highly optimized throughput and makes it simpler to integrate complex classification and detection algorithms. INEX further leverages some of the world’s most powerful AI productivity tools by integrating NVIDIA pre-trained models and the NVIDIA Transfer Learning Toolkit into its development workflow, reducing development time by a stunning 60%. And by going end to end with the full stack of NVIDIA hardware and software and deploying on the NVIDIA Jetson edge platform, INEX reduced hardware and setup costs by 60% and lowered operating and maintenance costs by 50%.

The implications and impact for INEX are significant. Leveraging the NVIDIA platform, INEX can roll out world-class solutions that perform challenging real-time vehicle detection and classification and read license plates from all 50 US states, and the company has expanded to countries in Europe, the Far East, the Middle East, and Australia. Tolling authorities upgrading to the INEX vehicle classification and ALPR system can supercharge their toll systems quickly and easily, leveraging the latest AI technology.

Read more in our solution showcase.

Categories
Misc

NVIDIA Research: Learning Modular Scene Representations With Neural Scene Graphs

NVIDIA researchers will present their paper “Neural Scene Graph Rendering” at SIGGRAPH 2021, August 9-13, which introduces a neural scene representation inspired by traditional graphics scene graphs. 

Recent advances in neural rendering have pushed the boundaries of photorealistic rendering; take StyleGAN, for example, which produces realistic images of fictional people. The next big challenge is bringing these neural techniques into digital content-creation applications, like Maya and Blender. This requires a new generation of neural scene models that offer artistic control and modularity comparable to classical 3D meshes and material representations.

“In order to kick-off these developments, we needed to step back a little bit and scale down the scene complexity,” mentions Jonathan Granskog, the first author of the paper.

This is one of the reasons why the images in the paper are reminiscent of the early years of computer graphics. However, the artistic control and the granularity of the neural elements are closer to what modern applications would require to integrate neural rendering into traditional authoring pipelines. The proposed approach allows organizing learned neural elements into an (animated) scene graph, much like in standard authoring tools.

Three frames from an animation with tangram shapes that gradually morph from one assembly into another. The twirl deformation is applied to individual pieces during the transition.

Frames from a 2D sprite animation featuring 16 alpha-masked textures that are instantiated over a static background image. The prediction attains most of the texture detail. Artifacts appear primarily where two “ground” tiles meet due to slightly softer reproduction of texture edges.

Two diffuse tori playing beach volleyball with a volumetric ball. In the right-most column, the materials of the ball and tori are swapped.

A neural element may represent, for instance, the geometry of a teapot or the appearance of porcelain. Each such scene element is stored as an abstract, high-dimensional vector with its parameters being learned from images. During the training process, the method also learns how to manipulate and render these abstract vectors. For instance, a vector representing a piece of geometry can be translated, rotated, bent, or twisted using a manipulator. Analogously, material elements can be altered by stretching the texture content, desaturating it, or changing the hue.

Since the optimizable components (vectors, manipulators, and the renderer) are very general, the approach can handle both 2D and 3D scenes without changing the methodology. The artist can compose a scene by organizing the vectors and manipulators into a scene graph. The scene graph is then collapsed into a stream of neural primitives that are translated into an RGB image using a streaming neural renderer, much like a rasterizer would turn a stream of triangles into an image.
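
As an illustration of the structure being described, here is a hypothetical sketch (not the paper’s implementation; the names SceneNode, Manipulator, and collapse are invented for this example) of a scene graph whose nodes carry learned vectors and manipulators, flattened into a stream of primitives for a streaming renderer:

```python
from dataclasses import dataclass, field
from typing import Callable, List
import numpy as np

# A learned scene element is just an abstract high-dimensional vector;
# manipulators are (possibly learned) functions that transform such vectors.
Manipulator = Callable[[np.ndarray], np.ndarray]

@dataclass
class SceneNode:
    geometry: np.ndarray                      # learned geometry vector
    material: np.ndarray                      # learned material/appearance vector
    manipulators: List[Manipulator] = field(default_factory=list)
    children: List["SceneNode"] = field(default_factory=list)

def collapse(node: SceneNode, inherited=()):
    """Flatten the scene graph into a stream of (geometry, material) primitives,
    applying manipulators accumulated along each path from the root, much like a
    rasterizer consumes a stream of triangles."""
    ops = list(inherited) + node.manipulators
    g = node.geometry
    for op in ops:
        g = op(g)
    yield g, node.material
    for child in node.children:
        yield from collapse(child, ops)

# Example: a "translate" manipulator could itself be a small learned network;
# here a stand-in function perturbs the latent geometry vector.
translate = lambda v: v + 0.1
root = SceneNode(np.zeros(8), np.ones(8), [translate],
                 children=[SceneNode(np.ones(8), np.zeros(8))])
primitives = list(collapse(root))   # stream fed to a (neural) renderer
```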

The analogy to the traditional scene graphs and rendering pipelines is not coincidental.

“Our goal is to eventually combine neural and classical scene primitives, and bringing the representations closer to each other is the first step on that path,” says Jan Novák, a co-author of the paper.

This will unlock the possibility of extracting scene elements from photographs using AI algorithms, combining them with classical graphics representations, and composing scenes and animations in a controlled manner.

The animations on this page illustrate the potential. The individual neural elements were learned from images of random static scenes. An artist then defined a sequence of scene graphs to produce a fluid animation consisting of the learned elements. While there is still a long way to go to reach the visual quality and scene complexity of modern applications with this approach, the paper presents a feasible approach for bringing neural and classical rendering together. Once these fully join forces, real-time photorealistic rendering could experience the next quantum leap.

Learn more: Check out the project website.

Categories
Offsites

Quantum Machine Learning and the Power of Data

Quantum computing has rapidly advanced in both theory and practice in recent years, and with it the hope of impact in real applications. One key area of interest is how quantum computers might affect machine learning. We recently demonstrated experimentally that quantum computers are able to naturally solve certain problems with complex correlations between inputs that can be incredibly hard for traditional, or “classical”, computers. This suggests that learning models made on quantum computers may be dramatically more powerful for select applications, potentially boasting faster computation, better generalization on less data, or both. Hence it is of great interest to understand in what situations such a “quantum advantage” might be achieved.

The idea of quantum advantage is typically phrased in terms of computational advantages. That is, given some task with well defined inputs and outputs, can a quantum computer achieve a more accurate result than a classical machine in a comparable runtime? There are a number of algorithms for which quantum computers are suspected to have overwhelming advantages, such as Shor’s factoring algorithm for factoring products of large primes (relevant to RSA encryption) or the quantum simulation of quantum systems. However, the difficulty of solving a problem, and hence the potential advantage for a quantum computer, can be greatly impacted by the availability of data. As such, understanding when a quantum computer can help in a machine learning task depends not only on the task, but also the data available, and a complete understanding of this must include both.

In “Power of data in quantum machine learning”, published in Nature Communications, we dissect the problem of quantum advantage in machine learning to better understand when it will apply. We show how the complexity of a problem formally changes with the availability of data, and how this sometimes has the power to elevate classical learning models to be competitive with quantum algorithms. We then develop a practical method for screening when there may be a quantum advantage for a chosen set of data embeddings in the context of kernel methods. We use the insights from the screening method and learning bounds to introduce a novel method that projects select aspects of feature maps from a quantum computer back into classical space. This enables us to imbue the quantum approach with additional insights from classical machine learning, yielding the best empirical separation in quantum learning advantages to date.

Computational Power of Data
The idea of quantum advantage over a classical computer is often framed in terms of computational complexity classes. Examples such as factoring large numbers and simulating quantum systems are classified as bounded-error quantum polynomial time (BQP) problems, which are those thought to be handled more easily by quantum computers than by classical systems. Problems easily solved on classical computers are called bounded-error probabilistic polynomial time (BPP) problems.

We show that learning algorithms equipped with data from a quantum process, such as a natural process like fusion or chemical reactions, form a new class of problems (which we call BPP/Samp) that can efficiently perform some tasks that traditional algorithms without data cannot, and which is a subclass of the problems efficiently solvable with polynomial-sized advice (P/poly). This demonstrates that, for some machine learning tasks, understanding the potential for quantum advantage requires examining the available data as well.


Geometric Test for Quantum Learning Advantage

Informed by the results that the potential for advantage changes depending on the availability of data, one may ask how a practitioner can quickly evaluate if their problem may be well suited for a quantum computer. To help with this, we developed a workflow for assessing the potential for advantage within a kernel learning framework. We examined a number of tests, the most powerful and informative of which was a novel geometric test we developed.

In quantum machine learning methods, such as quantum neural networks or quantum kernel methods, a quantum program is often divided into two parts, a quantum embedding of the data (an embedding map for the feature space using a quantum computer), and the evaluation of a function applied to the data embedding. In the context of quantum computing, quantum kernel methods make use of traditional kernel methods, but use the quantum computer to evaluate part or all of the kernel on the quantum embedding, which has a different geometry than a classical embedding. It was conjectured that a quantum advantage might arise from the quantum embedding, which might be much better suited to a particular problem than any accessible classical geometry.

We developed a quick and rigorous test that can be used to compare a particular quantum embedding, kernel, and data set to a range of classical kernels and assess whether there is any opportunity for quantum advantage across, e.g., possible label functions such as those used for image recognition tasks. Based on the geometric test, we define a geometric constant g, which quantifies the amount of data that could theoretically close the gap between the quantum and classical models. This is an extremely useful technique for deciding, based on data constraints, whether a quantum solution is right for the given problem.

Projected Quantum Kernel Approach
One insight revealed by the geometric test was that existing quantum kernels often suffered from a geometry that was easy to best classically because they encouraged memorization instead of understanding. This inspired us to develop a projected quantum kernel, in which the quantum embedding is projected back to a classical representation. While this representation is still hard to compute with a classical computer directly, it comes with a number of practical advantages compared to staying in the quantum space entirely.

Geometric quantity g, which quantifies the potential for quantum advantage, depicted for several embeddings, including the projected quantum kernel introduced here.

By selectively projecting back to classical space, we can retain aspects of the quantum geometry that are still hard to simulate classically, but it becomes much easier to develop distance functions, and hence kernels, that are better behaved with respect to modest changes in the input than the original quantum kernel was. In addition, the projected quantum kernel facilitates better integration with powerful non-linear kernels (like a squared exponential) that have been developed classically, which is much more challenging to do in the native quantum space.
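
As a toy illustration of that last point (a minimal sketch, not the paper’s method: the “projected” features here are stand-in classical vectors, and scikit-learn’s SVC is used with a precomputed squared-exponential kernel):

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Stand-in for projected quantum features: one classical vector per data point.
X_train = rng.normal(size=(100, 8))
y_train = (X_train[:, 0] * X_train[:, 1] > 0).astype(int)
X_test = rng.normal(size=(20, 8))

def sq_exp_kernel(A, B, gamma=0.5):
    """Squared-exponential (RBF) kernel between rows of A and rows of B."""
    d2 = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2 * A @ B.T
    return np.exp(-gamma * d2)

# Once the features live in a classical space, any classical kernel machinery applies.
clf = SVC(kernel="precomputed")
clf.fit(sq_exp_kernel(X_train, X_train), y_train)
preds = clf.predict(sq_exp_kernel(X_test, X_train))
```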

This projected quantum kernel has a number of benefits over previous approaches, including an improved ability to describe non-linear functions of the existing embedding, a reduction in the resources needed to process the kernel from quadratic to linear with the number of data points, and the ability to generalize better at larger sizes. The kernel also helps to expand the geometric g, which helps to ensure the greatest potential for quantum advantage.

Data Sets Exhibit Learning Advantages
The geometric test quantifies the potential advantage for all possible label functions; in practice, however, we are most often interested in specific label functions. Using learning-theoretic approaches, we also bound the generalization error for specific tasks, including those that are definitively quantum in origin. Because the advantage of a quantum computer relies on its ability to use many qubits simultaneously, but previous approaches scale poorly with the number of qubits, it is important to verify the tasks at reasonably large qubit counts (>20) to ensure a method has the potential to scale to real problems. For our studies we verified up to 30 qubits, which was enabled by the open-source tool TensorFlow-Quantum, allowing scaling to petaflops of compute.

Interestingly, we showed that many naturally quantum problems, even up to 30 qubits, were readily handled by classical learning methods when sufficient data were provided. Hence one conclusion is that even for some problems that look quantum, classical machine learning methods empowered by data can match the power of quantum computers. However, using the geometric construction in combination with the projected quantum kernel, we were able to construct a data set that exhibited an empirical learning advantage for a quantum model over a classical one. Thus, while it remains an open question to find such data sets in natural problems, we were able to show the existence of label functions where this can be the case. Although this problem was engineered and a quantum computational advantage would require the embeddings to be larger and more challenging, this work represents an important step in understanding the role data plays in quantum machine learning.

Prediction accuracy as a function of the number of qubits (n) for a problem engineered to maximize the potential for learning advantage in a quantum model. The data is shown for two different sizes of training data (N).

For this problem, we scaled up the number of qubits (n) and compared the prediction accuracy of the projected quantum kernel to existing kernel approaches and the best classical machine learning model in our dataset. Moreover, a key takeaway from these results is that although we showed the existence of datasets where a quantum computer has an advantage, for many quantum problems, classical learning methods were still the best approach. Understanding how data can affect a given problem is a key factor to consider when discussing quantum advantage in learning problems, unlike traditional computation problems for which that is not a consideration.

Conclusions
When considering the ability of quantum computers to aid in machine learning, we have shown that the availability of data fundamentally changes the question. In our work, we develop a practical set of tools for examining these questions, and use them to develop a new projected quantum kernel method that has a number of advantages over existing approaches. We build towards the largest numerical demonstration to date, 30 qubits, of potential learning advantages for quantum embeddings. While a complete computational advantage on a real world application remains to be seen, this work helps set the foundation for the path forward. We encourage any interested readers to check out both the paper and related TensorFlow-Quantum tutorials that make it easy to build on this work.

Acknowledgements
We would like to acknowledge our co-authors on this paper — Michael Broughton, Masoud Mohseni, Ryan Babbush, Sergio Boixo, and Hartmut Neven, as well as the entirety of the Google Quantum AI team. In addition, we acknowledge valuable help and feedback from Richard Kueng, John Platt, John Preskill, Thomas Vidick, Nathan Wiebe, Chun-Ju Wu, and Balint Pato.


Categories
Misc

Retrieve Similar Images

I am trying to build a similar-image retrieval system where, given an image, the system shows the top ‘k’ most similar images. For this particular example, I am using the DeepFashion dataset: given an image containing, say, a shirt, the system should show the top 5 clothing items most similar to that shirt. A subset of this dataset has 289,222 diverse clothing images. Each image has shape (300, 300, 3).

The approach I have includes:

  1. Train an autoencoder
  2. Feed each image in the dataset through the encoder to get a reduced n-dimensional latent space representation. For example, it can be 100-d latent space representation
  3. Create a table of shape m x (n + 2) where ‘m’ is the number of images and each image is compressed to n dimensions. One of the extra columns is the image name and the other is the path to where the image is stored on your local system
  4. Given a new image, you feed it through the encoder to get the n-dimensional latent space representation
  5. Use something like cosine similarity to compare the n-d latent space representation of the new image with the m x (n + 2) table obtained in step 3 and retrieve the top k closest clothes

How do I create the table mentioned in step 3?

I am planning on using TensorFlow 2.5 with Python 3.8 and the code for getting an image generator is as follows:

image_generator = ImageDataGenerator(
    rescale=1./255,
    rotation_range=135)
train_data_gen = image_generator.flow_from_directory(
    directory=train_dir,
    batch_size=batch_size,
    shuffle=False,
    target_size=(IMG_HEIGHT, IMG_WIDTH),
    class_mode='sparse')

How can I get the image name and the path to each image to create the m x (n + 2) table in step 3?
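
For reference, one possible way to assemble such a table (a minimal sketch, assuming a trained `encoder` model and the `train_data_gen` iterator above with shuffle=False; the filenames exposed by flow_from_directory are used to recover names and paths):

```python
import os
import numpy as np
import pandas as pd

# `encoder` is the trained encoder half of the autoencoder (assumption),
# `train_data_gen` is the shuffle=False generator created above.
latents = encoder.predict(train_data_gen)            # shape: (m, n)

names = [os.path.basename(f) for f in train_data_gen.filenames]
paths = [os.path.join(train_dir, f) for f in train_data_gen.filenames]

table = pd.DataFrame(latents, columns=[f"z{i}" for i in range(latents.shape[1])])
table["image_name"] = names
table["image_path"] = paths                          # m x (n + 2) table

# Retrieval: cosine similarity between a new image's latent code and the table.
def top_k(query_latent, k=5):
    Z = table[[f"z{i}" for i in range(latents.shape[1])]].to_numpy()
    q = query_latent / np.linalg.norm(query_latent)
    sims = (Z / np.linalg.norm(Z, axis=1, keepdims=True)) @ q
    return table.iloc[np.argsort(-sims)[:k]][["image_name", "image_path"]]
```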

Also, is there any other better way that I am missing out on?

Thanks!

submitted by /u/grid_world
[visit reddit] [comments]

Categories
Misc

Severe underfitting CNN models

I am building an image classifier of sorts that takes an image of a speedometer and “reads” the value. I have a collection of about 4,000 images, all labeled with GPS velocity values. I read in the images and create the X and Y training and validation sets. However, the model doesn’t learn at all. I even tried pre-built models that TensorFlow provides, like ResNet50 and Xception, both of which gave similar if not identical results, with the loss and accuracy constant. When I added regularization it made things much worse: the accuracy was still fixed close to zero and the loss skyrocketed over 1,000,000. I realize that there is no “silver bullet” when tuning a neural network, so all suggestions are welcome.

```python

##########################import dependencies

import imp
import tensorflow as tf
import matplotlib.pyplot as plt
import numpy as np
import os
import sys
import sklearn as sk
import cv2
import pandas as pd
import scipy
from scipy.signal import fftconvolve
from tokenize import endpats
from glob import glob
from os.path import join, basename
from tensorflow.keras import layers, models, datasets
from tensorflow import keras
from keras import regularizers
from sklearn.model_selection import train_test_split

##################### Create different models to test
# User-defined model that can be changed to experiment with different structures.
# input_shape is important: make sure you either resize all images to match or,
# if your images are too large and would lose a lot of data, change the input
# shape here to match your images, or change the img.resize() function in the
# load_data functions. If you choose to keep your image size, remove the
# resize() calls from the load_data functions and set the width and height to
# match your images. The third dimension specifies the number of channels in
# your image. If you want to load color images into models that have the
# channels set to 1 (e.g. create_model()), you will have to change the 1 to a 3.

def create_model():
    model = keras.models.Sequential()
    model.add(layers.Conv2D(50, (3, 3), activation='relu', input_shape=(90, 160, 1)))
    model.add(layers.MaxPooling2D((2, 2)))
    model.add(layers.Conv2D(25, (3, 3), activation='relu'))
    model.add(layers.MaxPooling2D((2, 2)))
    model.add(layers.Conv2D(10, (3, 3), activation='relu'))
    model.add(layers.Flatten())
    model.add(layers.Dense(20, activation='relu'))
    model.add(layers.Dropout(0.1))
    model.add(layers.Dense(10, activation='relu'))
    model.add(layers.Dense(1, activation='softmax'))
    model.compile(optimizer='adam', loss="categorical_crossentropy", metrics=['accuracy'])
    return model

# resnet architecture

def resnet():
    base_model = keras.applications.resnet50.ResNet50(
        weights='imagenet', include_top=False, input_shape=[90, 160, 3])
    avg = keras.layers.GlobalAveragePooling2D()(base_model.output)
    output = keras.layers.Dense(1, activation='softmax')(avg)
    model = keras.Model(inputs=base_model.input, outputs=output)

    for layer in base_model.layers:
        layer.trainable = False
    optimizer = keras.optimizers.SGD()
    model.compile(loss="MSE", optimizer=optimizer, metrics=["accuracy"])
    return model

# not exact, but based off of the AlexNet architecture

def alexNet():
    model = keras.models.Sequential()
    model.add(layers.Conv2D(96, (11, 11), activation='relu', padding='valid', input_shape=(90, 160, 1)))
    model.add(layers.MaxPooling2D((3, 3), strides=2, padding='valid'))
    model.add(layers.Conv2D(128, (5, 5), activation='relu', padding='same'))
    model.add(layers.MaxPooling2D((3, 3), strides=2, padding='valid'))
    model.add(layers.Conv2D(128, (3, 3), activation='relu', padding='same', strides=1))
    model.add(layers.Conv2D(128, (3, 3), activation='relu', padding='same', strides=1))
    model.add(layers.Conv2D(128, (3, 3), activation='relu', padding='same', strides=1))
    model.add(layers.Flatten())
    model.add(layers.Dense(2048, activation='relu'))
    model.add(layers.Dense(1024, activation='relu'))
    model.add(layers.Dense(1, activation='softmax'))
    model.compile(optimizer='adam', loss="MSE", metrics=['accuracy'])
    return model

# lenet5 architecture

def lenet5():
    model = keras.models.Sequential([
        keras.layers.Conv2D(6, 5, activation='tanh', padding="same", input_shape=[90, 160, 1]),
        keras.layers.MaxPooling2D(2, strides=2),
        keras.layers.Conv2D(16, 5, activation='tanh', padding="same"),
        keras.layers.MaxPooling2D(2, strides=2),
        keras.layers.Conv2D(120, 5, activation='tanh', padding="same"),
        keras.layers.Flatten(),
        keras.layers.Dense(84, activation='tanh'),
        keras.layers.Dense(1, activation='softmax'),
    ])
    optimizer = keras.optimizers.SGD()
    model.compile(loss="MSE", optimizer=optimizer, metrics=["accuracy"])
    return model

# xception architecture

def xception():
    base_model = tf.keras.applications.xception.Xception(
        include_top=False, weights='imagenet', input_shape=[90, 160, 3])
    avg = keras.layers.GlobalAveragePooling2D()(base_model.output)
    output = keras.layers.Dense(1, activation='softmax')(avg)
    model = keras.Model(inputs=base_model.input, outputs=output)
    optimizer = keras.optimizers.SGD()
    model.compile(loss="MSE", optimizer=optimizer, metrics=["accuracy"])
    return model

########################## Filtering Functions
# Input: single image array
# Output: single image array with detected edges

def edges_single_img(img):
    # define the vertical filter
    vertical_filter = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]
    # define the horizontal filter
    horizontal_filter = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]

    n, m, d = img.shape
    edges_img = img.copy()
    # loop over all pixels in the image
    for row in range(3, n-2):
        for col in range(3, m-2):
            # create little local 3x3 box
            local_pixels = img[row-1:row+2, col-1:col+2, 0]
            # apply the vertical filter
            vertical_transformed_pixels = vertical_filter*local_pixels
            # remap the vertical score
            vertical_score = vertical_transformed_pixels.sum()/4
            # apply the horizontal filter
            horizontal_transformed_pixels = horizontal_filter*local_pixels
            # remap the horizontal score
            horizontal_score = horizontal_transformed_pixels.sum()/4
            # combine the horizontal and vertical scores into a total edge score
            edge_score = (vertical_score**2 + horizontal_score**2)**.5
            # insert this edge score into the edges image
            edges_img[row, col] = [edge_score]*3
    # remap the values in the 0-1 range in case they went out of bounds
    edges_img = edges_img/edges_img.max()
    return edges_img

################## untested function
# Input: multi-dimensional array of images
# Output: multi-dimensional array with detected edges

def edges_array(img_array):
    # define the vertical filter
    vertical_filter = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]
    # define the horizontal filter
    horizontal_filter = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]

    w = len(img_array[0, :, 0])
    p = len(img_array[0, 0, :])
    l = len(img_array[:, 0, 0])
    edge_img_array = np.zeros((l, w, p))
    for i in range(len(img_array[:, 0, 0])):
        img = img_array[i, :, :]
        n, m, d = img.shape
        edges_img = img.copy()
        # loop over all pixels in the image
        for row in range(3, n-2):
            for col in range(3, m-2):
                # create little local 3x3 box
                local_pixels = img[row-1:row+2, col-1:col+2, 0]
                # apply the vertical filter
                vertical_transformed_pixels = vertical_filter*local_pixels
                # remap the vertical score
                vertical_score = vertical_transformed_pixels.sum()/4
                # apply the horizontal filter
                horizontal_transformed_pixels = horizontal_filter*local_pixels
                # remap the horizontal score
                horizontal_score = horizontal_transformed_pixels.sum()/4
                # combine the horizontal and vertical scores into a total edge score
                edge_score = (vertical_score**2 + horizontal_score**2)**.5
                # insert this edge score into the edges image
                edges_img[row, col] = [edge_score]*3
        # remap the values in the 0-1 range in case they went out of bounds
        edges_img = edges_img/edges_img.max()
        edge_img_array[i, :, :] = edges_img
    return edge_img_array

#############User input functions

def get_mode():
    print("Operating modes:\n 1. Run all models\n 2. Run single model\n ")
    mode = input("Enter Operating mode: ")
    mode = int(mode)
    if (mode > 2 or mode < 1):
        print("Error 1: Invalid Operating mode! Please enter a valid operating mode")
    return mode

def model_choice(mode):
    if (mode == 1):
        print("Loading all models...")
        model_num = 0
        return model_num
    elif (mode == 2):
        print("Available Models:\n 1. Custom Model \n 2. lenet5 \n 3. AlexNet Variant \n 4. ResNet50 \n 5. Xception")
        model_num = input("Enter the model that you would like to test: ")
        model_num = int(model_num)
        if (int(model_num) > 5 or int(model_num) < 1):
            print("Error 2: Invalid model! Please enter a valid model")
        else:
            return model_num
    else:
        print("Error 3: Invalid mode passed")

############ Load and parse data (preprocessing)
# Input: these functions take a file path as their input. This is the path to
# the directory with ALL images.
# Important note: image names must follow the naming convention
# "img#####_##.##.jpeg": the first 5 numbers are the image index, the next 2 are
# the tens and ones place of the velocity, and the final two are the tenths and
# hundredths of the velocity. If you wish to use a different naming convention,
# please edit the "create_labels" function to use your naming style.
# Output: preprocessed, labeled data.
# create_labels creates a list of all file names in the directory and sorts them.

def create_labels(path_to_imgs):
    files = glob(join(path_to_imgs, '*', '*.jpg'), recursive=True)
    files.sort(key=basename)
    labels = []
    for x in files:
        start = x.find('_')
        end = x.find('.j')
        labels.append(float(x[start+1:end]))
    labels = np.asarray(labels, dtype=float)
    return labels, files

# loads color images and converts to grayscale
# In the load_data*() functions, you can change the w and h variables to match
# your images if you do not want to resize, or change them to whatever size
# works best for your images.

def load_data_gray(path_to_imgs):
    w = 16 * 10
    h = 9 * 10
    labels, files = create_labels(path_to_imgs)
    num_imgs = len(files)
    temp_array = np.zeros((num_imgs, h, w))
    for idx, path in enumerate(files):
        img = cv2.imread(path)
        img = cv2.resize(img, (w, h))
        img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
        temp_array[idx, :, :] = img
    temp_array = temp_array.reshape(num_imgs, 90, 160, -1)
    return temp_array, labels

# assumes the library of images is already in grayscale

def load_data_asGray(path_to_imgs):
    w = 16 * 10
    h = 9 * 10
    labels, files = create_labels(path_to_imgs)
    num_imgs = len(files)
    temp_array = np.zeros((num_imgs, h, w))
    for idx, path in enumerate(files):
        img = cv2.imread(path)
        img = cv2.resize(img, (w, h))
        temp_array[idx, :, :] = img
    temp_array = temp_array.reshape(num_imgs, 90, 160, -1)
    return temp_array, labels

# loads color images

def load_data_color(path_to_imgs):
    w = 16 * 10
    h = 9 * 10
    labels, files = create_labels(path_to_imgs)
    num_imgs = len(files)
    temp_array = np.zeros((num_imgs, h, w, 3))
    for idx, path in enumerate(files):
        img = cv2.imread(path)
        img = cv2.resize(img, (w, h))
        temp_array[idx, :, :, :] = img
    temp_array = temp_array.reshape(num_imgs, 90, 160, 3)
    return temp_array, labels

def custom_train_test_split(img_arr, labels):
    X_train, X_test, y_train, y_test = train_test_split(
        img_arr, labels, test_size=0.2, random_state=42)
    return X_train, X_test, y_train, y_test

##################### Training functions
# Input: for models that only use grayscale images, we pass in trainingImages
# and trainingLabels so the data only has to be loaded once. For models that
# require color images (resnet), we do not pass trainingImages and
# trainingLabels; those are loaded inside the training function.
# Output: an array of training data

def train_Xception_model(path_to_imgs):
    model = xception()
    print(model.summary())
    trainingImages, trainingLabels = load_data_color(path_to_imgs)
    history = model.fit(trainingImages, trainingLabels, epochs=10)
    return history.history

def train_custom_model(trainingImages, trainingLabels):
    model = create_model()
    print(model.summary())
    X_train, X_test, y_train, y_test = custom_train_test_split(trainingImages, trainingLabels)
    history = model.fit(X_train, y_train, epochs=100, batch_size=100,
                        validation_data=(X_test, y_test))
    return history.history

def train_resnet_model(path_to_imgs):
    model = resnet()
    trainingImages, trainingLabels = load_data_color(path_to_imgs)
    print(model.summary())
    history = model.fit(trainingImages, trainingLabels, epochs=10)
    return history.history

def train_alexnet_model(trainingImages, trainingLabels):
    model = alexNet()
    print(model.summary())
    history = model.fit(trainingImages, trainingLabels, epochs=10)
    return history.history

def train_lenet5_model(trainingImages, trainingLabels):
    model = lenet5()
    print(model.summary())
    history = model.fit(trainingImages, trainingLabels, epochs=10)
    return history.history

# Input: mode to operate in, model_num to determine which model to train if mode != 1
# Output: array(s) of training data

def training(mode, model_num, path_to_imgs):
    trainingImages, trainingLabels = load_data_gray(path_to_imgs)
    if (mode == 1):
        history_custom = train_custom_model(trainingImages, trainingLabels)
        history_alex = train_alexnet_model(trainingImages, trainingLabels)
        history_lenet5 = train_lenet5_model(trainingImages, trainingLabels)
        history_resnet = train_resnet_model(path_to_imgs)
        history_xception = train_Xception_model(path_to_imgs)
        # df1 = pd.DataFrame(history_custom.history)
        # df1.to_excel("custom_model_training.xlsx")
        # df2 = pd.DataFrame(history_alex.history)
        # df2.to_excel("alexnet_model_training.xlsx")
        # df3 = pd.DataFrame(history_lenet5.history)
        # df3.to_excel("lenet5_model_training.xlsx")
        # df4 = pd.DataFrame(history_resnet.history)
        # df4.to_excel("resnet_model_training.xlsx")
        # df5 = pd.DataFrame(history_xception.history)
        # df5.to_excel("xception_model_training.xlsx")
        return history_custom, history_lenet5, history_alex, history_resnet, history_xception
    elif (mode == 2):
        if (model_num == 1):
            history = train_custom_model(trainingImages, trainingLabels)
            # df = pd.DataFrame(history.history)
            # df.to_excel("custom_model_training.xlsx")
            return history
        elif (model_num == 2):
            history = train_lenet5_model(trainingImages, trainingLabels)
            # df = pd.DataFrame(history.history)
            # df.to_excel("lenet5_model_training.xlsx")
            return history
        elif (model_num == 3):
            history = train_alexnet_model(trainingImages, trainingLabels)
            # df = pd.DataFrame(history.history)
            # df.to_excel("alexnet_model_training.xlsx")
            return history
        elif (model_num == 4):
            history = train_resnet_model(path_to_imgs)
            # df = pd.DataFrame(history.history)
            # df.to_excel("resnet_model_training.xlsx")
            return history
        elif (model_num == 5):
            history = train_Xception_model()
            # df = pd.DataFrame(history.history)
            # df.to_excel("xception_model_training.xlsx")
        else:
            print("Invalid model number! Please choose again\n")
    else:
        print("Invalid mode! Please choose again\n")

################# Adjust TensorFlow settings to run on GPU
# Input: Boolean to tell TensorFlow to use GPU acceleration or to strictly use CPU
# Output: void

def set_tf_settings(use_GPU):
    if (use_GPU == False):
        os.environ["CUDA_VISIBLE_DEVICES"] = "-1"
    else:
        physical_devices = tf.config.list_physical_devices('GPU')
        try:
            tf.config.experimental.set_memory_growth(physical_devices[0], True)
            print(tf.config.experimental.get_device_details(physical_devices[0]))
            print(tf.config.experimental.get_memory_usage)
        except:
            # Invalid device or cannot modify virtual devices once initialized.
            pass
        print(physical_devices[0])

####################### Export Model Functions
# Input: TensorFlow model instance and the name that you want to give it
# Output: a JSON file for the model architecture and an h5 file for the weight
# values. Both can be loaded back using the inverse operations:
#   json_file = open('model.json', 'r')
#   loaded_model_json = json_file.read()
#   json_file.close()
#   loaded_model = model_from_json(loaded_model_json)
#   loaded_model.load_weights("model.h5")

def export_model(model, model_name):
    model_json = model.to_json()
    with open(model_name + ".json", "w") as json_file:
        json_file.write(model_json)
    print("Model saved!")
    model.save_weights(model_name + "_weights.h5")
    print("weights saved!")

PATH = '/home/[name]/Git/speedometer-data/imgs/'

def main():
    mode = get_mode()
    model_num = model_choice(mode)
    if (mode == 2):
        history = training(mode, model_num, PATH)
        print(history.history)
    elif (mode == 1):
        history_custom, history_lenet5, history_alex, history_resnet, history_xception = training(
            mode, model_num)
        print(history_custom)
        print(history_lenet5)
        print(history_alex)
        print(history_resnet)
        print(history_xception)
    else:
        print("Error 4: Invalid Mode Selected!")

main()

```

submitted by /u/NameError-undefined
[visit reddit] [comments]

Categories
Misc

NVIDIA Research: Appearance-Driven Automatic 3D Model Simplification

NVIDIA will present a new paper introducing a method for generating levels of detail for complex models, taking both geometry and surface appearance into account.

NVIDIA will be presenting a new paper titled “Appearance-Driven Automatic 3D Model Simplification” at the Eurographics Symposium on Rendering 2021 (EGSR), June 29-July 2, introducing a method for generating levels of detail for complex models, taking both geometry and surface appearance into account.

Level-of-detail for aggregate geometry, where we represent each leaf as a semi-transparent textured quad. The geometrical complexity is greatly reduced, to just 0.4% of the original triangle count, with little visual impact.

Level-of-detail has long been used in computer games as a means of improving performance and reducing aliasing artifacts that may occur due to shading detail or small geometric features. Traditional approaches to level of detail include mesh simplification, normal map baking, and shading/BSDF prefiltering. Each problem is typically tackled in isolation.

We approach level-of-detail entirely in image space, with our optimization objective being “does a simplified model look like the reference when rendered from a certain distance?” (i.e., we use a standard image loss). This perspective is not entirely new, but recent advances in differentiable rendering have transformed it from a theoretical exercise to something highly practical, with excellent performance. We propose an efficient inverse rendering method and system that can be used to simultaneously optimize shape and materials to generate level-of-detail models, or clean up the result of automatic simplification tools.
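
To make the structure of such an inverse-rendering loop concrete, here is a toy sketch (hypothetical code, not the paper's system: the "renderer" is a stand-in differentiable function, and TensorFlow is used only to illustrate the gradient-based optimization against an image-space loss):

```python
import tensorflow as tf

# Stand-in "differentiable renderer": any differentiable map from scene
# parameters to an image is enough to illustrate the optimization loop.
def render(params):
    # e.g. blur/downsample the parameter image, as a proxy for viewing from afar
    img = tf.reshape(params, (1, 32, 32, 1))
    return tf.nn.avg_pool2d(img, ksize=4, strides=4, padding="SAME")

reference = render(tf.random.uniform((32 * 32,), seed=0))   # target appearance

params = tf.Variable(tf.zeros(32 * 32))                      # simplified model's parameters
opt = tf.keras.optimizers.Adam(learning_rate=0.05)

for step in range(200):
    with tf.GradientTape() as tape:
        loss = tf.reduce_mean((render(params) - reference) ** 2)  # image-space loss
    grads = tape.gradient(loss, [params])
    opt.apply_gradients(zip(grads, [params]))
```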

Approaching model simplification through inverse rendering lets us unify previous methods into a single system, optimizing for a single loss. This is important, because the system can negotiate which rendering term is best suited to represent a detail. An example is shown in the image below, where we create a simplified version of the Ewer statue. By using normal mapping in the inverse rendering setup, the system automatically determines which features are best represented by geometry, and which can be represented by the normal map.

Normal map; ours (7k tris); reference (300k tris).

We show that our method is applicable to a wide range of applications, including level-of-detail, normal and displacement map baking, shape and appearance prefiltering, and simplification of aggregate geometry, all while supporting animated geometry. We can additionally convert between surface representations (e.g., convert an implicit surface to a mesh), between different material representations, and between different renderers.

Refer to the paper and supplemental material for full results. Our source code is publicly available at GitHub.

Learn more: Check out the project website.