Categories
Misc

Different Types of Edge Computing

The types of edge computing and examples of use cases for each.

Many organizations have started their journey towards edge computing to take advantage of data produced at the edge. The definition of edge computing is quite broad. Simply stated, it is moving compute power physically closer to where data is generated, usually an edge device or IoT sensor.

This encompasses far-edge scenarios like mobile devices and smart sensors, as well as near-edge use cases like micro-data centers and remote-office computing. In fact, the definition is so broad that edge computing is often described as anything outside of the cloud or main data center.

With such a wide variety of use cases, it is important to understand the different types of edge computing and how they are being used by organizations today. 

Provider edge

The provider edge is a network of computing resources accessed via the Internet. It is mainly used to deliver services from telcos, service providers, media companies, and other content delivery network (CDN) operators. Examples of use cases include content delivery, online gaming, and AI as a service (AIaaS).

One key example of the provider edge that is expected to grow rapidly is augmented reality (AR) and virtual reality (VR). Service providers want to find ways to deliver these experiences, commonly known as eXtended Reality (XR), from the cloud to end-user edge systems.

In late 2021, Google partnered with NVIDIA to deliver high-quality XR streaming from NVIDIA RTX-powered servers on Google Cloud to lightweight mobile XR displays. By using NVIDIA CloudXR to stream from the provider edge, users can securely access data from the cloud at any time and easily share high-fidelity, fully immersive XR experiences with other teams or customers.

Enterprise edge

The enterprise edge is an extension of the enterprise data center, consisting of things like data centers at remote office sites, micro-data centers, or even racks of servers sitting in a compute closet on a factory floor. This environment is generally owned and operated by IT as they would a traditional centralized data center, though there may be space or power limitations at the enterprise edge that change the design of these environments.

Retailers can use edge AI across their business for frictionless shopping, in-store analytics, and supply chain optimization.
Figure 1. Enterprises across all industries use edge AI to drive more intelligent use cases on site.

Looking at examples of the enterprise edge, you can see workloads like intelligent warehouses and fulfillment centers. Improving the efficiency and automation of these environments requires robust information, data, and operational technologies to enable AI solutions like real-time product recognition.

Kinetic Vision helps customers build AI for these enterprise edge environments using a digital twin, or photorealistic virtual version, of a fulfillment or distribution center to train and optimize a classification model that is then deployed in the real world. This powers faster, more agile product inspection and order fulfillment.

Industrial edge

The industrial edge, sometimes called the far edge, generally has smaller compute instances that can be one or two small, ruggedized servers or even embedded systems deployed outside of any sort of data center environment.

Industrial edge use cases include robotics, autonomous checkout, smart city capabilities like traffic control, and intelligent devices. These use cases run entirely outside of the normal data center structure, which means there are a number of unique challenges for space, cooling, security, and management.

BMW is leading the way at the industrial edge, adopting robotics to redefine its factory logistics. Different robots handle different parts of the process: they pick up boxes of raw parts on the line and transport them to shelves, where the parts await production; the boxes are then taken to manufacturing and finally returned to the supply area when empty.

Robotics use cases require compute power both in the autonomous machine itself, as well as compute systems that sit on the factory floor. To optimize the efficiency and accelerate deployment of these solutions, NVIDIA introduced the NVIDIA Isaac Autonomous Mobile Robot (AMR) platform.

Accelerating edge computing

Each of these edge computing scenarios has different requirements, benefits, and deployment challenges. To understand if your use case would benefit from edge computing, download the Considerations for Deploying AI at the Edge whitepaper.

Sign up for Edge AI News to stay up to date with the latest trends, customer use cases, and technical walkthroughs.

Categories
Misc

Jaguar Land Rover Announces Partnership With NVIDIA

As part of Jaguar Land Rover’s Reimagine strategy, the partnership will transform the modern luxury experience for customers starting in 2025. Software experts from both companies will jointly …

Categories
Misc

The Greatest Podcast Ever Recorded

Is this the best podcast ever recorded? Let’s just say you don’t need a GPU to know that’s a stretch. But it’s pretty great if you’re a fan of tall tales. And better still if you’re not a fan of stretching the truth at all. That’s because detecting hyperbole may one day get more manageable, Read article >


Categories
Misc

Reimagining Modern Luxury: NVIDIA Announces Partnership with Jaguar Land Rover

Jaguar Land Rover and NVIDIA are redefining modern luxury, infusing intelligence into the customer experience. As part of its Reimagine strategy, Jaguar Land Rover announced today that it will develop its upcoming vehicles on the full-stack NVIDIA DRIVE Hyperion 8 platform, with DRIVE Orin delivering a wide spectrum of active safety, automated driving and parking Read article >


Categories
Misc

Adding node and cell names for tensorboard graph


I am trying to trace the tensorboard graph for https://github.com/promach/gdas

However, from what I can observe so far, the TensorBoard graph does not show user-understandable node names or cell names, which makes it difficult to track down the connections within the graph.

Any suggestions?

Note: I am using torch.utils.tensorboard

https://preview.redd.it/3ou8jst067i81.png?width=1202&format=png&auto=webp&s=ee8adf7916b89371267d71a62518f3b042692c20
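Not a definitive answer, but one thing that usually helps with torch.utils.tensorboard: add_graph derives the scopes in the traced graph from the attribute names of your submodules, so giving each cell or block a descriptive attribute name (or registering cells in a named nn.ModuleDict) tends to make the graph much easier to follow. A minimal sketch with made-up module names rather than the actual gdas code:

import torch
import torch.nn as nn
from torch.utils.tensorboard import SummaryWriter

class Cell(nn.Module):
    def __init__(self, channels):
        super().__init__()
        # Descriptive attribute names ("conv3x3", "bn") become part of the
        # node scopes shown in the TensorBoard graph.
        self.conv3x3 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.bn = nn.BatchNorm2d(channels)

    def forward(self, x):
        return torch.relu(self.bn(self.conv3x3(x)))

class Net(nn.Module):
    def __init__(self, channels=16, num_cells=3):
        super().__init__()
        self.stem = nn.Conv2d(3, channels, kernel_size=3, padding=1)
        # Named keys ("cell_0", "cell_1", ...) show up as scopes in the graph,
        # which is easier to read than anonymous nn.Sequential indices.
        self.cells = nn.ModuleDict(
            {f"cell_{i}": Cell(channels) for i in range(num_cells)}
        )

    def forward(self, x):
        x = self.stem(x)
        for cell in self.cells.values():
            x = cell(x)
        return x

writer = SummaryWriter("runs/graph_demo")
writer.add_graph(Net(), torch.randn(1, 3, 32, 32))
writer.close()

Running TensorBoard on that log directory and opening the Graphs tab should then show nodes grouped under those attribute names instead of anonymous operator names.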

submitted by /u/promach

Categories
Misc

Atos Previews Energy-Efficient, AI-Augmented Hybrid Supercomputer

Stepping deeper into the era of exascale AI, Atos gave the first look at its next-generation high-performance computer. The BullSequana XH3000 combines Atos’ patented fourth-generation liquid-cooled HPC design with NVIDIA technologies to deliver both more performance and energy efficiency. Giving users a choice of Arm or x86 computing architectures, it will come in versions using Read article >


Categories
Misc

Tensorflow error "W tensorflow/core/data/root_dataset.cc:163] Optimization loop failed: CANCELLED: Operation was cancelled"

Hi, I’m trying to train a GAN on the Fashion-MNIST dataset, but whenever I train it, it keeps printing the following each epoch:

W tensorflow/core/data/root_dataset.cc:163] Optimization loop failed: CANCELLED: Operation was cancelled
W tensorflow/core/data/root_dataset.cc:163] Optimization loop failed: CANCELLED: Operation was cancelled
W tensorflow/core/data/root_dataset.cc:163] Optimization loop failed: CANCELLED: Operation was cancelled
W tensorflow/core/data/root_dataset.cc:163] Optimization loop failed: CANCELLED: Operation was cancelled
W tensorflow/core/data/root_dataset.cc:163] Optimization loop failed: CANCELLED: Operation was cancelled
W tensorflow/core/data/root_dataset.cc:163] Optimization loop failed: CANCELLED: Operation was cancelled

It doesn’t seem to actually train whenever I look at the generated images. Here is the code:

import tensorflow as tf
from tensorflow import keras
import matplotlib.pyplot as plt
import numpy as np

(X_train, y_train), (X_test, Y_test) = keras.datasets.fashion_mnist.load_data()
X_train = X_train // 255.0

def plot_multiple_images(images, n_cols=None):
    n_cols = n_cols or len(images)
    n_rows = (len(images) - 1) // n_cols + 1
    if images.shape[-1] == 1:
        images = np.squeeze(images, axis=-1)
    plt.figure(figsize=(n_cols, n_rows))
    for index, image in enumerate(images):
        plt.subplot(n_rows, n_cols, index + 1)
        plt.imshow(image, cmap="binary")
        plt.axis("off")

# np.random.seed(42)
tf.random.set_seed(42)

codings_size = 30

generator = keras.models.Sequential([
    keras.layers.Dense(100, activation="selu", input_shape=[codings_size]),
    keras.layers.Dense(150, activation="selu"),
    keras.layers.Dense(28 * 28, activation="sigmoid"),
    keras.layers.Reshape([28, 28])
])
discriminator = keras.models.Sequential([
    keras.layers.Flatten(input_shape=[28, 28]),
    keras.layers.Dense(150, activation="selu"),
    keras.layers.Dense(100, activation="selu"),
    keras.layers.Dense(1, activation="sigmoid")
])
gan = keras.models.Sequential([generator, discriminator])

discriminator.compile(loss="binary_crossentropy", optimizer="rmsprop")
discriminator.trainable = False
gan.compile(loss="binary_crossentropy", optimizer="rmsprop")

batch_size = 32
dataset = tf.data.Dataset.from_tensor_slices(X_train).shuffle(1000)
dataset = dataset.batch(batch_size, drop_remainder=True).prefetch(1)
batch_size = 32

def train_gan(gan, dataset, batch_size, codings_size, n_epochs=50):
    generator, discriminator = gan.layers
    for epoch in range(n_epochs):
        print("Epoch {}/{}".format(epoch + 1, n_epochs))
        for X_batch in dataset:
            # phase 1 - training the discriminator
            noise = tf.random.normal(shape=[batch_size, codings_size])
            generated_images = generator(noise)
            X_batch = tf.cast(X_batch, tf.float32)
            X_fake_and_real = tf.concat([generated_images, X_batch], axis=0)
            y1 = tf.constant([[0.]] * batch_size + [[1.]] * batch_size)
            discriminator.trainable = True
            discriminator.train_on_batch(X_fake_and_real, y1)
            # phase 2 - training the generator
            noise = tf.random.normal(shape=[batch_size, codings_size])
            y2 = tf.constant([[1.]] * batch_size)
            discriminator.trainable = True
            gan.train_on_batch(noise, y2)  # not shown
        plot_multiple_images(generated_images, 8)
        plt.show()

train_gan(gan, dataset, batch_size, codings_size)

I’m using Aurélien Géron’s “Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow”.
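Setting the CANCELLED warning itself aside, one detail worth comparing against the published notebook for that chapter: X_train // 255.0 is floor division, which maps every uint8 pixel value below 255 to 0.0, so the "real" images fed to the discriminator end up almost entirely blank. A minimal sketch of the usual scaling step (true division into [0, 1] floats):

import numpy as np
from tensorflow import keras

(X_train, y_train), (X_test, y_test) = keras.datasets.fashion_mnist.load_data()

# True division (/) scales the uint8 pixels into [0, 1] floats;
# floor division (//) maps every value below 255 to 0.0.
X_train = X_train.astype(np.float32) / 255.0
print(X_train.min(), X_train.max())  # expect 0.0 1.0

It may also be worth checking whether discriminator.trainable should be switched off rather than on in the phase 2 block; if memory serves, the book's version of the training loop disables it there so that gan.train_on_batch only updates the generator.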

Here is the repository.

Hope this is the right sub for questions.

submitted by /u/General_wolffe

Categories
Misc

RNN with three or two LSTM layers? Is the first LSTM layer also the input layer?

Hey guys, I am struggling with how to count the hidden layers of this RNN (see below). Does this model contain three or two hidden LSTM layers? I’m not sure. Is the first LSTM layer simultaneously the input layer, or is there an additional input layer before the first LSTM layer?

Can anyone help me?

https://preview.redd.it/68svyk5hx5i81.png?width=605&format=png&auto=webp&s=628761d42413223d7d8521c81ce438516092abc5
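Not a definitive answer without seeing the full model, but as a point of reference for counting layers in Keras: passing input_shape to the first LSTM does not make that LSTM the input layer. Keras creates an implicit InputLayer (a placeholder with no weights of its own), and every LSTM in the stack counts as a hidden layer. A minimal sketch with made-up sizes:

from tensorflow import keras

# Three stacked LSTM layers; the sizes here are placeholders.
model = keras.models.Sequential([
    # input_shape creates an implicit InputLayer behind the scenes;
    # this first LSTM is still a hidden layer, not the input layer.
    keras.layers.LSTM(64, return_sequences=True, input_shape=(None, 10)),
    keras.layers.LSTM(64, return_sequences=True),
    keras.layers.LSTM(32),
    keras.layers.Dense(1),  # output layer
])
model.summary()  # lists the three LSTM layers plus the Dense output layer

So a Sequential model built this way has three hidden recurrent layers plus whatever output layer follows; there is no separate input layer you need to add yourself.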

submitted by /u/Necessary_Radish9885

Categories
Misc

LDA in tensorflow

Hi, I really hope someone can help me with this :) Do I need to reduce dimensionality using LDA before passing the data into my TensorFlow training/testing model? I can’t really find any TensorFlow LDA code online, so I am currently using scikit-learn’s LDA and then passing the data into TensorFlow. Is that OK? Or do I not even need to reduce dimensionality at all if I use TensorFlow? 🙂 Thanks
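For what it's worth, fitting scikit-learn's LinearDiscriminantAnalysis as a preprocessing step and feeding the reduced features into a Keras model is a perfectly reasonable pattern; TensorFlow does not require dimensionality reduction, it just changes the input size. A rough sketch of that pipeline, using synthetic data and made-up layer sizes:

import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import train_test_split
from tensorflow import keras

# Synthetic data: 500 samples, 20 features, 3 classes.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 20))
y = rng.integers(0, 3, size=500)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Fit LDA on the training split only, then transform both splits.
# LDA can produce at most (n_classes - 1) components, so 2 here.
lda = LinearDiscriminantAnalysis(n_components=2)
X_train_lda = lda.fit_transform(X_train, y_train)
X_test_lda = lda.transform(X_test)

model = keras.models.Sequential([
    keras.layers.Dense(16, activation="relu", input_shape=(2,)),
    keras.layers.Dense(3, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(X_train_lda, y_train, epochs=5, verbose=0)
print(model.evaluate(X_test_lda, y_test, verbose=0))

Whether the reduction actually helps depends on the data; with enough samples, a network trained on the full feature set often does just as well, so it is worth comparing both.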

submitted by /u/Any-Chocolate1995

Categories
Offsites

Good News About the Carbon Footprint of Machine Learning Training

Machine learning (ML) has become prominent in information technology, which has led some to raise concerns about the associated rise in the costs of computation, primarily the carbon footprint, i.e., total greenhouse gas emissions. While these assertions rightfully elevated the discussion around carbon emissions in ML, they also highlight the need for accurate data to assess true carbon footprint, which can help identify strategies to mitigate carbon emission in ML.

In “The Carbon Footprint of Machine Learning Training Will Plateau, Then Shrink”, accepted for publication in IEEE Computer, we focus on operational carbon emissions — i.e., the energy cost of operating ML hardware, including data center overheads — from training of natural language processing (NLP) models and investigate best practices that could reduce the carbon footprint. We demonstrate four key practices that reduce the carbon (and energy) footprint of ML workloads by large margins, which we have employed to help keep ML under 15% of Google’s total energy use.

The 4Ms: Best Practices to Reduce Energy and Carbon Footprints
We identified four best practices that reduce energy and carbon emissions significantly — we call these the “4Ms” — all of which are being used at Google today and are available to anyone using Google Cloud services.

  • Model. Selecting efficient ML model architectures, such as sparse models, can advance ML quality while reducing computation by 3x–10x.
  • Machine. Using processors and systems optimized for ML training, versus general-purpose processors, can improve performance and energy efficiency by 2x–5x.
  • Mechanization. Computing in the Cloud rather than on premise reduces energy usage and therefore emissions by 1.4x–2x. Cloud-based data centers are new, custom-designed warehouses equipped for energy efficiency for 50,000 servers, resulting in very good power usage effectiveness (PUE). On-premise data centers are often older and smaller and thus cannot amortize the cost of new energy-efficient cooling and power distribution systems.
  • Map Optimization. Moreover, the cloud lets customers pick the location with the cleanest energy, further reducing the gross carbon footprint by 5x–10x. While one might worry that map optimization could lead to the greenest locations quickly reaching maximum capacity, user demand for efficient data centers will result in continued advancement in green data center design and deployment.

These four practices together can reduce energy by 100x and emissions by 1000x.
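As a rough sanity check, the headline figures follow from multiplying the per-practice ranges quoted above (the assumption that the factors stack multiplicatively is mine, though the article's cumulative examples suggest the same):

# Combine the per-practice improvement ranges quoted in the 4Ms list above.
# Assumes the factors multiply independently, which is a simplification.
model = (3, 10)             # efficient architectures such as sparse models
machine = (2, 5)            # ML-optimized processors vs. general-purpose
mechanization = (1.4, 2)    # efficient cloud data centers vs. on-premise
map_optimization = (5, 10)  # choosing a location with cleaner energy

energy_low = model[0] * machine[0] * mechanization[0]
energy_high = model[1] * machine[1] * mechanization[1]
carbon_high = energy_high * map_optimization[1]

print(f"energy reduction: ~{energy_low:.0f}x to ~{energy_high:.0f}x")  # up to ~100x
print(f"carbon reduction: up to ~{carbon_high:.0f}x")                  # ~1000x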

Note that Google matches 100% of its operational energy use with renewable energy sources. Conventional carbon offsets are usually retrospective up to a year after the carbon emissions and can be purchased anywhere on the same continent. Google has committed to decarbonizing all energy consumption so that by 2030, it will operate on 100% carbon-free energy, 24 hours a day on the same grid where the energy is consumed. Some Google data centers already operate on 90% carbon-free energy; the overall average was 61% carbon-free energy in 2019 and 67% in 2020.

Below, we illustrate the impact of improving the 4Ms in practice. Other studies examined training the Transformer model on an Nvidia P100 GPU in an average data center and energy mix consistent with the worldwide average. The recently introduced Primer model reduces the computation needed to achieve the same accuracy by 4x. Using newer-generation ML hardware, like TPUv4, provides an additional 14x improvement over the P100, or 57x overall. Efficient cloud data centers gain 1.4x over the average data center, resulting in a total energy reduction of 83x. In addition, using a data center with a low-carbon energy source can reduce the carbon footprint another 9x, resulting in a 747x total reduction in carbon footprint over four years.

Reduction in gross carbon dioxide equivalent emissions (CO2e) from applying the 4M best practices to the Transformer model trained on P100 GPUs in an average data center in 2017, as done in other studies. Displayed values are the cumulative improvement successively addressing each of the 4Ms: updating the model to Primer; upgrading the ML accelerator to TPUv4; using a Google data center with better PUE than average; and training in a Google Oklahoma data center that uses very clean energy.
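The cumulative numbers in the Transformer example above can be reproduced approximately by chaining the quoted per-step factors; the small gaps against the quoted totals come from rounding of the individual factors:

# Chain the per-step 4M improvements quoted for the Transformer-on-P100 baseline.
steps = [
    ("switch to the Primer model", 4),
    ("upgrade from P100 to TPUv4", 14),
    ("use an efficient cloud data center (better PUE)", 1.4),
    ("train where the energy is low-carbon", 9),
]

cumulative = 1.0
for description, factor in steps:
    cumulative *= factor
    print(f"after '{description}': ~{cumulative:.0f}x")
# the article quotes the cumulative totals as 4x, 57x, 83x, and 747x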

Overall Energy Consumption for ML
Google’s total energy usage increases annually, which is not surprising considering increased use of its services. ML workloads have grown rapidly, as has the computation per training run, but paying attention to the 4Ms — optimized models, ML-specific hardware, efficient data centers — has largely compensated for this increased load. Our data shows that ML training and inference are only 10%–15% of Google’s total energy use for each of the last three years, each year split ⅗ for inference and ⅖ for training.

Prior Emission Estimates
Google uses neural architecture search (NAS) to find better ML models. NAS is typically performed once per problem domain/search space combination, and the resulting model can then be reused for thousands of applications — e.g., the Evolved Transformer model found by NAS is open sourced for all to use. As the optimized model found by NAS is often more efficient, the one-time cost of NAS is typically more than offset by emission reductions from subsequent use.

A study from the University of Massachusetts (UMass) estimated carbon emissions for the Evolved Transformer NAS.

  • Without ready access to Google hardware or data centers, the study extrapolated from the available P100 GPUs instead of TPUv2s, and assumed US average data center efficiency instead of highly efficient hyperscale data centers. These assumptions increased the estimate by 5x over the energy used by the actual NAS computation that was performed in Google’s data center.
  • In order to accurately estimate the emissions for NAS, it’s important to understand the subtleties of how they work. NAS systems use a much smaller proxy task to search for the most efficient models to save time, and then scale up the found models to full size. The UMass study assumed that the search repeated full size model training thousands of times, resulting in emission estimates that are another 18.7x too high.

The overshoot for the NAS was 88x: 5x for energy-efficient hardware in Google data centers and 18.7x for computation using proxies. The actual CO2e for the one-time search was 3,223 kg versus 284,019 kg, 88x less than the published estimate.
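A quick check of the figures quoted above, using only the numbers given in the text:

# Ratio of the published NAS estimate to the measured emissions.
published_estimate_kg = 284_019  # UMass extrapolation for the one-time search
actual_kg = 3_223                # measured CO2e in Google's data center

print(f"overshoot: ~{published_estimate_kg / actual_kg:.0f}x")  # ~88x

# The two quoted correction factors compose to roughly the same ratio;
# the residual gap comes from rounding of the individual factors.
hardware_and_datacenter = 5.0    # P100 + average data center vs. TPUv2 + Google DC
proxy_vs_full_training = 18.7    # assumed full-size training instead of proxy tasks
print(f"{hardware_and_datacenter} x {proxy_vs_full_training} = "
      f"{hardware_and_datacenter * proxy_vs_full_training:.1f}x")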

Unfortunately, some subsequent papers misinterpreted the NAS estimate as the training cost for the model it discovered, yet emissions for this particular NAS are ~1300x larger than for training the model. These papers estimated that training the Evolved Transformer model takes two million GPU hours, costs millions of dollars, and that its carbon emissions are equivalent to five times the lifetime emissions of a car. In reality, training the Evolved Transformer model on the task examined by the UMass researchers and following the 4M best practices takes 120 TPUv2 hours, costs $40, and emits only 2.4 kg (0.00004 car lifetimes), 120,000x less. This gap is nearly as large as if one overestimated the CO2e to manufacture a car by 100x and then used that number as the CO2e for driving a car.

Outlook
Climate change is important, so we must get the numbers right to ensure that we focus on solving the biggest challenges. Within information technology, we believe these are much more likely the lifecycle costs — i.e., emission estimates that include the embedded carbon emitted from manufacturing all components involved, from chips to data center buildings — of manufacturing computing equipment of all types and sizes [1] rather than the operational cost of ML training.

Expect more good news if everyone improves the 4Ms. While these numbers may currently vary across companies, these simple measures can be followed across the industry.

If the 4Ms become widely recognized, we predict a virtuous circle that will bend the curve so that the global carbon footprint of ML training is actually shrinking, not increasing.

Acknowledgements
Let me thank my co-authors who stayed with this long and winding investigation into a topic that was new to most of us: Jeff Dean, Joseph Gonzalez, Urs Hölzle, Quoc Le, Chen Liang, Lluis-Miquel Munguia, Daniel Rothchild, David So, and Maud Texier. We also had a great deal of help from others along the way for an earlier study that eventually led to this version of the paper. Emma Strubell made several suggestions for the prior paper, including the recommendation to examine the recent giant NLP models. Christopher Berner, Ilya Sutskever, OpenAI, and Microsoft shared information about GPT-3. Dmitry Lepikhin and Zongwei Zhou did a great deal of work to measure the performance and power of GPUs and TPUs in Google data centers. Hallie Cramer, Anna Escuer, Elke Michlmayr, Kelli Wright, and Nick Zakrasek helped with the data and policies for energy and CO2e emissions at Google.



[1] Worldwide IT manufacturing for 2021 included 1700M cell phones, 340M PCs, and 12M data center servers.