
Develop, Deploy, and Distribute Immersive Experiences with NVIDIA CloudXR and Amazon Web Services

Use NVIDIA CloudXR alongside AWS to build and stream immersive XR experiences from the cloud, with key advantages at every stage from development to distribution.

Creating immersive applications with high-fidelity 3D graphics has never been more accessible, thanks to recent advances in extended reality (XR) hardware and software. Despite this growth, developing augmented reality (AR) and virtual reality (VR) applications still comes with challenges:

  • Large investments must be made in local development workstations.
  • Computing power of end-user devices is still limited.
  • Deploying applications to distributed users can introduce management and security complexities.

By using NVIDIA CloudXR alongside the Amazon NICE DCV streaming protocol, you can use on-demand compute resources for every aspect of your immersive application development. This includes services that support end-to-end workflows, tight control over the security of your data, and simplified management of deploying and delivering application updates.

Advantages of AWS and NVIDIA CloudXR

Production-grade interactive XR applications require a collection of supporting technologies to build, deliver, and consume content successfully. The advent of cloud-based microservices has made it easier to deploy these components as an end-to-end solution. The compute, storage, and networking resources of Amazon Web Services (AWS) can be used to develop, deploy, and distribute XR applications, which can then be remotely rendered and streamed with NVIDIA CloudXR over distributed networks.

Leveraging both AWS and NVIDIA CloudXR presents several advantages:

  • First, reliable cloud networks and security tools provide a consistent security boundary around your valuable data.
  • Second, this solution makes it possible to remotely render high-fidelity, low-latency experiences on globally available, graphics-accelerated instances.
  • Finally, providing on-demand resources for distributed XR development teams lets you scale your workforce with little operational overhead.
Figure 1. Example AWS architecture, including VPN, 3D asset store, virtual workstation, on-demand AppStream 2.0 fleet, and streaming protocols to the client devices

Figure 1 demonstrates an example configuration of NVIDIA CloudXR with AWS services to develop, deploy, and distribute XR applications.

CloudXR overview

NVIDIA CloudXR is an SDK that enables real-time GPU rendering and streaming of rich VR, AR, and XR applications from remote servers, including the cloud. Applications that typically require a tethered head-mounted display (HMD) can instead be streamed to low-powered VR devices or tablets without degrading performance.

NVIDIA CloudXR provides the key XR streaming component and enables AWS customers to use the cloud for end-to-end XR application development. The NVIDIA CloudXR Amazon Machine Image (AMI) is bundled with NVIDIA RTX Virtual Workstation and currently supports Amazon EC2 G4 instances.
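As an illustration, a CloudXR server can be launched from the AMI with a few lines of boto3. This is a minimal sketch: the AMI ID, key pair, and security group below are placeholders you would replace with the Marketplace AMI ID for your Region and your own account resources.

import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

response = ec2.run_instances(
    ImageId="ami-0123456789abcdef0",    # placeholder: NVIDIA CloudXR AMI ID from the Marketplace
    InstanceType="g4dn.xlarge",         # G4 instance with an NVIDIA T4 GPU
    MinCount=1,
    MaxCount=1,
    KeyName="my-key-pair",              # placeholder key pair
    SecurityGroupIds=["sg-0123456789abcdef0"],  # placeholder security group allowing CloudXR traffic
)
print(response["Instances"][0]["InstanceId"])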

NICE DCV overview

Amazon NICE DCV is a high-performance remote display protocol. It lets you securely deliver remote desktops and application streaming from any cloud or data center to any device, over varying network conditions. By using NICE DCV with Amazon EC2, you can run graphics-intensive applications remotely on Amazon EC2 instances.

Powered by the latest NVIDIA GPUs, NICE DCV delivers low-latency performance for artists and developers building next-generation immersive applications. You can then stream the results to more modest client machines, eliminating the need for expensive dedicated workstations.

Managing content in the cloud

Streaming applications require content to be useful. When you create an asset management pipeline for immersive applications in the cloud, AWS services such as Amazon Simple Storage Service (Amazon S3), Amazon DynamoDB (a low-latency NoSQL database), and Amazon API Gateway serve as core components; a short sketch of how they fit together follows the list below.

  • Amazon S3 serves as the datastore for 3D assets and application build artifacts.
  • DynamoDB tables store asset metadata and application server information.
  • API Gateway acts as the front door through which your client-side immersive application accesses that data.
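As a rough sketch of how these three services fit together, the snippet below uploads an asset to S3 and records its metadata in DynamoDB; the bucket name, table name, and asset fields are hypothetical, not part of the reference architecture.

import boto3

s3 = boto3.client("s3")
dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table("XrAssetMetadata")   # placeholder table name

# Store the 3D asset itself in S3.
s3.upload_file("robot_arm.glb", "xr-3d-assets", "assets/robot_arm.glb")

# Record its metadata so the client application can discover it through
# an API Gateway endpoint backed by this table.
table.put_item(Item={
    "asset_id": "robot_arm",
    "s3_key": "assets/robot_arm.glb",
    "format": "glTF",
    "version": 1,
})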

When your application is ready to be deployed, build artifacts can be sent from S3 to Amazon AppStream 2.0, a fully managed nonpersistent desktop and application-streaming service, or to an NVIDIA GPU-equipped Amazon EC2 instance. These instances are also configured with the NVIDIA CloudXR SDK server, which allows them to accept streaming requests from devices running the NVIDIA CloudXR client.
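On the streaming instance itself, pulling the latest build artifact down from S3 is a one-call operation; the bucket, key, and local path below are placeholders.

import boto3

s3 = boto3.client("s3")

# Copy the build artifact from S3 onto the CloudXR server instance.
s3.download_file(
    "xr-build-artifacts",              # placeholder bucket
    "builds/my-xr-app-latest.zip",     # placeholder object key
    "C:/CloudXR/my-xr-app-latest.zip", # local path on the Windows streaming server
)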

Figure 2. Collaboration in a streaming VR application built by AWS using Unreal Engine’s Collab Viewer Templates: two users in VR headsets control avatars reviewing a shared 3D model

Summary

By streaming experiences with NVIDIA CloudXR, your data never leaves the data center. Globally available AWS instances let you render experiences close to your end users, while NVIDIA CloudXR does the heavy lifting to reduce perceived latency and provide a smooth experience. Using NVIDIA CloudXR as part of your development and testing process also means you can test against the same target platform you deploy to.

By using NVIDIA CloudXR to stream immersive experiences from the AWS cloud, you can deliver high-fidelity graphics to untethered AR and VR devices. With development workstations running NICE DCV on Amazon EC2, you can spin up graphics workstations for your team from any location. Using the AWS cloud for development, deployment, and distribution of your immersive applications gives you a highly scalable, secure, and resilient single source of truth for all of your 3D content.

Next steps

To get started with NVIDIA CloudXR on AWS, see the NVIDIA CloudXR AMI listed on the AWS Marketplace.

For more information about running virtual workstations on AWS, see Nimble Studio, which gives you the tools to get up and running quickly, with graphics workstations streamed from the cloud.

Finally, be sure to check out AppStream 2.0, a fully managed service to deploy streaming applications to any type of device.


I FUCKING HATE TENSORFLOW · This guy seems very honest

submitted by /u/metalwhalecom

Keras consumes too much RAM instead of GPU

I have 10 GB of data in RAM, with shape (10000, 3000, 61), and I am training an LSTM model. When I start training with a batch size of 10242, RAM usage reaches up to 50 GB, but GPU memory is only half used. If I increase the batch size further, the system hangs because of 100% RAM consumption. In short, I am wasting 50% of my GPU capacity: with a batch size of 10242, only 5.8 GB of GPU memory is used, and I have a 12 GB GPU.

submitted by /u/mrtac96
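One common way to avoid holding the full 10 GB array in RAM is to stream batches with tf.data instead of passing one giant in-memory array to fit(). The sketch below assumes the features and labels have been saved to .npy files so they can be memory-mapped; the file names, label shape, and batch size are illustrative, not taken from the post.

import numpy as np
import tensorflow as tf

# Memory-map the arrays instead of loading them fully into RAM.
X = np.load("features.npy", mmap_mode="r")   # shape (10000, 3000, 61)
y = np.load("labels.npy", mmap_mode="r")     # assumed shape (10000,)

def sample_generator():
    for i in range(len(X)):
        yield X[i], y[i]

dataset = (
    tf.data.Dataset.from_generator(
        sample_generator,
        output_signature=(
            tf.TensorSpec(shape=(3000, 61), dtype=tf.float32),
            tf.TensorSpec(shape=(), dtype=tf.float32),
        ),
    )
    .batch(256)                    # keep batches small enough for GPU memory
    .prefetch(tf.data.AUTOTUNE)    # overlap host-side loading with GPU compute
)

# model.fit(dataset, epochs=10)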


How does TensorFlow Lite Model Maker work?

Hi everyone, I am looking for a book, a site, or someone to explain to me how to train a model for TensorFlow Lite. In particular, I can’t figure out how to create a custom model with TensorFlow Lite Model Maker. Can anyone give me some advice?

submitted by /u/nicokingblogger
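For reference, the image-classification workflow in Model Maker is only a few lines; the folder name below is a placeholder for a directory with one sub-folder of images per class.

# pip install tflite-model-maker
from tflite_model_maker import image_classifier
from tflite_model_maker.image_classifier import DataLoader

# Load images from a folder where each sub-folder is a class label.
data = DataLoader.from_folder("my_images/")
train_data, test_data = data.split(0.9)

# Fine-tune the default backbone on the custom classes.
model = image_classifier.create(train_data)

loss, accuracy = model.evaluate(test_data)
model.export(export_dir=".")   # writes model.tflite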


How to set up a Google Colab account and use its GPU or TPU for free?

submitted by /u/alizakally

Advent of Code 2021 in pure TensorFlow – day 6

submitted by /u/pgaleone

MPI + TensorFlow – GPU not detected in MPI processes

I want to set up distributed RL: multiple workers and 1 learner. I have 1 GPU and 1 CPU with multiple cores.

So GPU:0 and CPU:0.

If I start the program normally via

python programm.py

it detects the GPU and CPU and lists both when I call tf.config.list_physical_devices().

However, if I start it as an MPI application via

mpiexec -np 4 python programm.py

every process lists only CPU:0 and no GPU is detected. How can I make at least one process see the GPU?

I use TF 2.7 and mpi4py.

submitted by /u/Willing-Classroom735
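A common pattern here, sketched below rather than offered as a confirmed fix for this setup, is to decide GPU visibility per MPI rank before TensorFlow initializes CUDA, for example giving only the learner (rank 0) access to the single GPU:

import os
from mpi4py import MPI

rank = MPI.COMM_WORLD.Get_rank()

# Must be set before TensorFlow is imported / initializes CUDA.
if rank != 0:
    os.environ["CUDA_VISIBLE_DEVICES"] = "-1"   # worker ranks: CPU only

import tensorflow as tf

gpus = tf.config.list_physical_devices("GPU")
if gpus:
    # Prevent the learner from grabbing all GPU memory up front.
    tf.config.experimental.set_memory_growth(gpus[0], True)

print(f"rank {rank}: {tf.config.list_physical_devices()}")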


Trying to train a model to generate text on my GPU, and keep getting [UNK] and "NaN" loss on one computer. However, on my MacBook it works just fine?

Here is the full code. It currently works just fine on my M1 MacBook running Monterey and TensorFlow-Metal. However, when I move the dataset and code to my laptop with an RTX 3060 Laptop GPU running Pop!_OS, I start getting [UNK] characters in the generated text and "NaN" loss. I’m unsure of what steps to take to fix this. Any advice would be appreciated.

import os, sys, time
import numpy as np
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.callbacks import ModelCheckpoint
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.preprocessing.sequence import pad_sequences
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.layers import StringLookup
from tensorflow.keras.layers import Embedding
from tensorflow.keras.layers import Bidirectional
from tensorflow.keras.layers import SimpleRNN
from tensorflow.keras.layers import Dense

BATCH_SIZE = 128
BUFFER_SIZE = 10_000
EMBEDDING_DIMENSION = 128
RNN_UNITS = 1024
CHECKPOINT_DIR = './training_checkpoints'
CHECKPOINT_PREFIX = os.path.join(CHECKPOINT_DIR, "ckpt_{epoch}")
EPOCHS = 16

def text_from_ids(ids):
    return tf.strings.reduce_join(chars_from_ids(ids), axis=1)

def split_input_target(sequence):
    input_text = sequence[:-1]
    target_text = sequence[1:]
    return input_text, target_text

def generate_text(model, seed_text, next_words, max_sequence_len):
    for _ in range(next_words):
        token_list = Tokenizer().texts_to_sequences([seed_text])[0]
        token_list = pad_sequences([token_list], maxlen=max_sequence_len-1, padding='pre')
        predicted = model.predict(token_list, verbose=0)
        output_word = ""
        for word, index in Tokenizer().word_index.items():
            if index == predicted:
                output_word = word
                break
        seed_text += " " + output_word
    return seed_text.title()

def generate_char(inputs):
    input_ids = tf.convert_to_tensor(ids_from_chars(inputs))
    predicted_logits = model(inputs=np.array([input_ids]))
    predicted_logits = predicted_logits[:, -1, :]
    # print(predicted_logits)
    predicted_logits = predicted_logits / 1.0
    # print(predicted_logits)
    predicted_ids = tf.random.categorical(predicted_logits, num_samples=1)
    predicted_ids = tf.squeeze(predicted_ids, axis=-1)
    return chars_from_ids(predicted_ids)

text = open("./data.txt", "rb").read().decode(encoding="UTF-8")
vocab = sorted(set(text))
vocab_size = len(vocab)
print(f"Text Length: {len(text)}")
print(f"Text Vocab: {vocab}")
print(f"Text Vocab Size: {vocab_size}")

ids_from_chars = StringLookup(vocabulary=list(vocab), mask_token=None, name='lookup')
chars_from_ids = StringLookup(vocabulary=ids_from_chars.get_vocabulary(), invert=True, mask_token=None)

all_ids = ids_from_chars(tf.strings.unicode_split(text, "UTF-8"))
ids_dataset = tf.data.Dataset.from_tensor_slices(all_ids)
sequence_length = 100
examples_per_epoch = len(text) // (sequence_length + 1)
sequences = ids_dataset.batch(sequence_length + 1, drop_remainder=True)
dataset = sequences.map(split_input_target)
dataset = (
    dataset.shuffle(BUFFER_SIZE)
    .batch(BATCH_SIZE, drop_remainder=True)
    .prefetch(tf.data.experimental.AUTOTUNE)
)

model = Sequential()
model.add(Embedding(vocab_size, EMBEDDING_DIMENSION, batch_input_shape=[BATCH_SIZE, None]))
model.add(SimpleRNN(RNN_UNITS, return_sequences=True))
model.add(Dense(vocab_size,))

checkpoint_callback = ModelCheckpoint(
    filepath=CHECKPOINT_PREFIX,
    save_weights_only=False,
    save_best_only=False,
    verbose=1
)

loss = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
model.compile(loss=loss, optimizer='adam', run_eagerly=True)
model.summary()
model.fit(dataset, batch_size=BATCH_SIZE, epochs=EPOCHS, callbacks=[checkpoint_callback])
model.save("./model/")

model = tf.keras.models.load_model("./model/")
next_char = tf.constant(["After "])
result = []
for n in range(256):
    next_char = generate_char(next_char)
    result.append(next_char)
print(tf.strings.join(result)[0].numpy().decode("utf-8"))

submitted by /u/weepthewillow_


My Model doesn’t seem to be learning, and I can’t figure out why

For the record, I’m absolutely expecting to have made some obvious error here, but I can’t seem to find it myself.

For context, I have implemented a very simple version of a game similar to Doodle Jump, and among other things I have attempted to train an FFNN to predict the correct/best inputs based on the state of the game. I have generated ~15k examples through manual play, with each frame having a 25% chance of being recorded for both state and inputs; for a slight bit of dataset balancing, half of all frames without any input are discarded.

I’m using a sequential model with 3 dense layers, using 15, 5 and 3 units respectively. I’ve specified a sigmoid activation for the input and hidden layers, and the input shape for the input layer.

For each example, the input consists of 15 scalar values: 7 pairs of values representing the distance of a platform to the player sprite in the horizontal and vertical directions, plus 1 value for the currently remaining timer (I’m using a 20-second timer to make comparisons reasonable). The labels consist of 3 integers, each either 1 or 0, representing whether the key associated with that position was pressed on that frame (left, right, up, in that order). The model compilation specifies the Adam optimizer and MeanSquaredError loss.

What I’m specifically hoping to predict is a set of 3 values, which I can check against a set threshold to determine whether the associated key should be pressed on that frame.

When training the model, however, I’m seeing no relevant drop in loss over 100 epochs (most recently it went from 0.1923 to 0.1901), and indeed the trained model’s behaviour will consistently see it pressing right and jump, with the value for the left key often being negative. On the one hand this seems to indicate extreme underfitting (since it’s predicting negative values when all example labels were positive), but on the other hand the sheer regularity with which this occurs might point to an error in my methodology.

I realise that any answer to this will be speculative at best, and that’s absolutely fine. I’ve tried everything I can think of (varying the number and size of the layers, varying the optimizer and/or loss function, varying the threshold, deleting and rebuilding the dataset…), so any ideas would be very welcome.

submitted by /u/Rhoderick
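One variation worth trying, sketched here with the 15-5-3 layer sizes from the post, is a sigmoid output layer trained with binary cross-entropy, which matches a three-way multi-label target (left/right/up) better than a linear output trained with mean squared error; the variable names and training arguments are illustrative.

import tensorflow as tf
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense

model = Sequential([
    Dense(15, activation="sigmoid", input_shape=(15,)),
    Dense(5, activation="sigmoid"),
    Dense(3, activation="sigmoid"),   # one probability per key: left, right, up
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["binary_accuracy"])

# states: (n_samples, 15) float array, key_labels: (n_samples, 3) array of 0/1
# model.fit(states, key_labels, epochs=100, batch_size=32)
# Press a key when its predicted probability exceeds a threshold such as 0.5.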


Oddly getting a dramatic accuracy difference for the same model when it is deployed on MATLAB compared to TensorFlow & Keras?

As the title describes, I’m getting a dramatic accuracy difference for the same model when I deploy it on MATLAB compared to TensorFlow and Keras. My model is a basic one that uses transfer learning to fine-tune a pre-trained model, namely ResNet50, by excluding the top and adding a Dense layer so it can be used for another classification task. I’ve added sample code showing the architecture of the model below. The model was trained with the same hyperparameters on both platforms. The accuracy values I get on TensorFlow and MATLAB are 0.7455 and 0.424, respectively, on CIFAR-10. What could be the reason behind this large accuracy difference? Could you please help me?

Model Architecture:

from tensorflow.keras.applications import ResNet50
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

base_model = ResNet50(include_top=False, weights='imagenet', input_shape=(75, 75, 3), pooling='max', classes=10)
base_model.trainable = False

model = Sequential()
model.add(base_model)
model.add(Dense(10, activation='softmax'))

submitted by /u/talhak
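One frequent source of such cross-platform gaps is input preprocessing rather than the model itself. The sketch below (not the poster’s code) shows the preprocessing that the Keras ResNet50 ImageNet weights expect; if the MATLAB pipeline normalizes images differently, the same weights can score very differently.

import tensorflow as tf
from tensorflow.keras.applications.resnet50 import preprocess_input

# Load CIFAR-10 and resize to the (75, 75, 3) input shape used in the post.
(_, _), (x_test, y_test) = tf.keras.datasets.cifar10.load_data()
x_test = tf.image.resize(x_test.astype("float32"), (75, 75))

# Keras 'imagenet' ResNet50 weights expect caffe-style preprocessing:
# RGB -> BGR channel order and per-channel ImageNet mean subtraction.
x_test = preprocess_input(x_test)

# loss, acc = model.evaluate(x_test, y_test)   # use identical preprocessing on both platforms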