Categories
Misc

Cooking Up New Network Models with NVIDIA Linux Switch

With Linux Switch on Spectrum, Yandex gained transparency and control over its network, disaggregated networking hardware from software, and lowered costs.

Picture this: you’re having dinner at an upscale restaurant. You look at the menu and decide that you’re in the mood for a filet. You order the steak medium rare. The waiter brings it out, it’s plated beautifully, and the service is great. Yet you hear a little voice in the back of your head. “I could have prepared this steak in my own kitchen exactly to my definition of medium rare!”

We’ve all had that feeling: to get that perfect outcome, sometimes you’ve got to put in some of the work yourself. Yandex recognized this, and that is why they have partnered with NVIDIA to use NVIDIA Linux Switch on NVIDIA Spectrum Ethernet switches. NVIDIA is uniquely positioned to enable and support pioneers like Yandex while they grow the open networking ecosystem.

Yandex is a Russian internet company. You could describe Yandex as a search engine much like Google, but they do much more than just search. Yandex provides many services to users: music and movie streaming, translation, intelligent personal assistants, and more.

As Anton Kortunov, networking lead at Yandex, explains, “Yandex has several data centers, with each data center containing tens or even hundreds of thousands of servers connected by thousands of switches. Managing this infrastructure is no small task.”

Yandex employs cloud operational models to make the deployment, administration, and automation of the data center as efficient and seamless as possible, at as low a cost as possible.

Technological considerations

In addition to the general requirements of cloud-scale efficiency and economy, Yandex had some key needs that had to be met by any networking solution they went with:

  • Extensive IPv6 support: their data centers rely heavily on IPv6 and, in particular, run BGP sessions inside the fabric over IPv6 link-local addresses.
  • QoS and ACL tools, along with a networking stack that integrates with telemetry, monitoring, and automation tools.
  • A network switch infrastructure with extensive zero-touch provisioning (ZTP) support. In particular, ZTP needed to tie into the monitoring tools to verify deployment and bring-up success at scale.
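
To make the first requirement concrete: IPv6 link-local addresses fall in the fe80::/10 range and are valid only on a single link, which is why they are usually written together with an interface (zone) name. The following is a minimal Python sketch, not taken from Yandex's tooling, that checks whether a BGP peer address is IPv6 link-local using the standard `ipaddress` module. The function name, the sample addresses, and the `swp1` interface name are hypothetical.

```python
import ipaddress


def is_ipv6_link_local(address: str) -> bool:
    """Return True if `address` is an IPv6 link-local address (fe80::/10).

    A zone suffix such as "%swp1" is stripped first, because the zone names
    the local interface and is not part of the address itself.
    """
    host = address.split("%", 1)[0]
    ip = ipaddress.ip_address(host)
    return ip.version == 6 and ip.is_link_local


# Hypothetical peer addresses, for illustration only.
peers = ["fe80::1%swp1", "2001:db8::1", "169.254.0.1"]
link_local_peers = [p for p in peers if is_ipv6_link_local(p)]
print(link_local_peers)
```

Only the first address passes the check: the second is a global (documentation) address, and the third is an IPv4 link-local address.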

Looking at their requirements, Yandex evaluated several options. To do the desired integration and automation, Yandex determined they needed a truly open-source NOS. SONiC was considered but had too many limitations around IPv6 addressing. Enter NVIDIA Linux Switch.

What is Linux Switch?

NVIDIA Linux Switch allows customers to run any Linux distribution as the network operating system on Spectrum Ethernet switches. The secret sauce is Switchdev, the Linux kernel driver model on which Linux Switch is based.

Much like in the server operating system case, Linux Switch is built for independence. Rather than using proprietary APIs, fully standard Linux kernel interfaces are used to control the switch silicon. This allows the switch and Linux distribution choice to be completely independent, with the switch hardware doing the heavy lifting through offloading. 

NVIDIA Linux Switch consists of hardware platforms, the Linux Kernel space, and a User Space built on Management and Routing Applications as well as the Linux OS.
Figure 1. NVIDIA Linux Switch stack

Linux Switch brings several key benefits to Yandex. As mentioned earlier, Yandex had some key technological requirements that had to be met with any networking OS. Linux Switch provides Yandex the flexibility to customize and optimize the switch to their exact needs, with no extra features driving up cost.

The transparency of Linux Switch and the Linux operating model also gives Yandex full visibility into the distribution, which greatly simplifies troubleshooting and debugging. It has also enabled Yandex to integrate their networking infrastructure with tooling built in-house for automation and configuration management.

By combining the Linux Switch implementation with these custom tools, Yandex had complete control over the feature set without having to build an operating system from scratch.

Why NVIDIA?

As part of the Linux kernel, there is nothing that restricts Switchdev to the NVIDIA Spectrum Ethernet platform. What makes NVIDIA Linux Switch the right choice?

The answer is that the hardware matters. The NVIDIA software-defined, hardware-accelerated approach makes Spectrum a uniquely suitable fit for Switchdev. With each new ASIC and platform, Switchdev support and compatibility are among the first considerations in the design stage, and every NVIDIA switch platform supports Linux Switch.

In addition, the breadth of switch port speeds (from 1 to 400G) and switch form factors (1/2U, as well as half-wide) enables the Spectrum portfolio to meet any data center networking need, Switchdev or otherwise.

Complementing the optimized hardware portfolio, NVIDIA is a key member of the open-source networking ecosystem. NVIDIA works with the open-source community to support and triage customer issues. Linux Switch optimizations made by NVIDIA engineering are upstreamed as part of all major Linux distributions.

The NVIDIA Spectrum platform has demonstrated a commitment to open Ethernet for almost ten years.
Figure 2. The NVIDIA Open Ethernet journey

With Linux Switch on Spectrum, Yandex gained transparency and control over its network, disaggregated networking hardware from software, and lowered costs. As Kortunov puts it, “We met our overall goal of letting whitebox switches act like vendor boxes.” Yandex got all the positives of proprietary vendor solutions without the negative baggage that comes with proprietary lock-in.

To learn more about Yandex and their networking journey with NVIDIA, see the joint session as part of GTC. From November 8-11, GTC features hundreds of sessions packed with interesting insights and discoveries from NVIDIA customers and partners.

To attend the session with Anton Kortunov and David Iles, register for GTC. The Yandex session will be live on November 11 at 9 AM CET and will be available on-demand afterwards. Enjoy!

Categories
Misc

NVIDIA GTC: A Complete Overview of Nsight Developer Tools

Read a complete overview of the Nsight suite of developer tools with new features and capabilities.

The Nsight suite of Developer Tools provides insightful tracing, debugging, profiling, and other analyses to optimize your complex computational applications across NVIDIA GPUs and CPUs, including x86, Arm, and Power architectures.

Unlocking the Power of GPU Profiler and Debugger: Nsight Systems 2021.5 and Nsight Compute 2021.3

NVIDIA Nsight Systems is a performance analysis tool designed to visualize, analyze, and optimize programming models, and to tune applications to scale efficiently across any quantity or size of CPUs and GPUs, from workstations to supercomputers.

Nsight Systems 2021.5 highlights include:

  • Statistics now available in graphical user interface (GUI).
  • Multireport view with horizontal and vertical layouts to aid investigations across server nodes, VMs, containers, ranks, and processes (coming soon).
  • Expert system now includes GPU utilization analysis for OpenGL and DX12.
  • NVIDIA NIC InfiniBand metrics sampling (experimental).
  • DirectX12 memory operations and warnings.
  • DXGI/DX12/Vulkan API calls correlation to WDDM queue packets.
  • Windows 11 support.

Learn more and download. >>

NVIDIA Nsight Compute 2021.3 adds new features for measuring and modeling occupancy, correlating source and assembly code, and a hierarchical roofline model for identifying bottlenecks caused by accessing cache memory.

Key features:

  • Occupancy Calculator – Helps you understand the hardware resource utilization of your kernels, and model how adjustments could impact occupancy.
  • Command line source page – Gives access to the information from the Source page in the GUI directly from the command line. With the --page source flag, you can print the lines of source, PTX, or assembly, together with the metrics collected for those lines. This adds flexibility when analyzing the collected data and when scripting and post-processing results for further reporting and analysis.
  • Hierarchical Roofline – The Roofline chart now supports a hierarchical roofline, which represents additional levels in the memory hierarchy, in addition to device memory. You can now see if developed kernels have bottlenecks related to cache memory.
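
To illustrate the idea behind a hierarchical roofline, here is a minimal Python sketch of the underlying arithmetic. It is not Nsight Compute's implementation. For each memory level, the attainable performance is the smaller of the peak compute rate and the arithmetic intensity at that level (FLOPs per byte moved) multiplied by that level's bandwidth. The level names, bandwidths, peak rate, and kernel traffic figures are all hypothetical values chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MemoryLevel:
    name: str
    bandwidth_gbs: float  # peak bandwidth in GB/s (hypothetical figure)


def roofline_bounds(flops, bytes_per_level, levels, peak_gflops):
    """Return the attainable GFLOP/s under each memory level's roofline.

    bytes_per_level maps a level name to the bytes the kernel moved at that
    level. FLOP/byte times GB/s gives GFLOP/s, capped by the compute peak.
    """
    bounds = {}
    for level in levels:
        intensity = flops / bytes_per_level[level.name]  # FLOP per byte
        bounds[level.name] = min(peak_gflops, intensity * level.bandwidth_gbs)
    return bounds


def limiting_level(bounds):
    """Name of the level with the lowest attainable performance."""
    return min(bounds, key=bounds.get)


# Hypothetical device and kernel numbers, for illustration only.
LEVELS = [
    MemoryLevel("L1", 8000.0),
    MemoryLevel("L2", 3000.0),
    MemoryLevel("DRAM", 900.0),
]
example = roofline_bounds(
    flops=2.0e9,
    bytes_per_level={"L1": 8.0e9, "L2": 1.0e9, "DRAM": 0.25e9},
    levels=LEVELS,
    peak_gflops=10000.0,
)
print(example, "limited by", limiting_level(example))
```

With these made-up numbers the kernel is bounded by L1 traffic rather than device memory, which is the kind of cache-related bottleneck a device-memory-only roofline would not reveal.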

There are additional improvements including more configurable baseline comparisons, access to source-level information from the CLI, and additional SSH functionality.
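
As a sketch of how the command line source page might be scripted, the following Python snippet builds an `ncu` invocation that prints the Source page of a previously saved report. The `--page source` flag is the one described above. Loading a report with `--import` and the report filename `my_kernel.ncu-rep` are assumptions made for this example. The snippet only constructs and prints the command, so it runs without Nsight Compute installed.

```python
import shlex


def ncu_source_page_command(report_path):
    """Build the argument list that prints a saved report's Source page."""
    return ["ncu", "--import", report_path, "--page", "source"]


# Hypothetical report name, for illustration only.
cmd = ncu_source_page_command("my_kernel.ncu-rep")
print(shlex.join(cmd))
```

On a machine with Nsight Compute installed, the list could be passed to `subprocess.run(cmd, capture_output=True, text=True)` and the captured text post-processed for reporting.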

Download >>


What’s New for Gaming and Graphics Developers: Nsight Graphics 2021.5, Nsight Perf SDK, and Nsight Aftermath SDK

NVIDIA Nsight Graphics is a powerful tool that enables you to debug and profile applications that use Direct3D (11, 12, DXR), Vulkan (1.2, Vulkan Ray Tracing), and OpenGL. It provides the ability to export frames for later analysis, as well as GPU Trace, a powerful profiler that enables you to visualize GPU low-level metrics.

This latest Nsight Graphics 2021.5 release extends support for multiple APIs with the following updates:

  • Full Windows 11 support for both API Capture and Tracing.
  • Acceleration Structure Viewer: Bounding Volume Overlap Analysis.
  • Ability to specify the continual refresh flag through the Nsight API.
  • Support for Linux NGX.

Learn more and download. >>

NVIDIA Nsight Perf SDK is a graphics profiling toolbox for DirectX, Vulkan, and OpenGL, enabling you to collect GPU performance metrics directly from your application. 

Key features:

  • Simplified APIs for HTML report generation.
  • Lower-level range profiling APIs, with utility libraries providing ease-of-use.
  • All of the preceding, usable in D3D11, D3D12, OpenGL, and Vulkan.
  • Samples illustrating use cases in 3D, Compute, and Ray Tracing.

Download >>

NVIDIA Nsight Aftermath SDK is a simple library you integrate into your D3D12 or Vulkan game’s crash reporter to generate GPU “mini-dumps” when a TDR or exception occurs. 

Key features:

  • Support for Windows 11 for Nsight Aftermath available with Nsight Graphics 2021.5.
  • Debug GPU exceptions through a detailed “mini-dump.”
  • Contains the state of GPU pipeline subunits at exception time.
  • Captures all active warps and current PCs.
  • Map warp locations back to original HLSL/GLSL source code.
  • Use markers to pinpoint the exception location in the API call stream.
  • Helpful for debugging GPU exceptions during development, while in QA, or from deployed applications.

Download >>


Efficient Model Design for In-App DL Inference: Nsight Deep Learning Designer 2021.2

Nsight Deep Learning Designer is a first-in-class IDE tool for developers who want to incorporate high-performance DL-based features into their applications.

It enables in-depth analysis of end-to-end DL workflows for efficient model design.

This release includes new features:

  • Inference performance profiling with GPU metrics and Tensor Core Utilization.
  • Analyze the model visually with the Channel Inspector.
  • Compatible with PyTorch.
  • Specialized Analysis Operators: Noise and Mix, Affine, Linear Blend, Resize, Selector, and many more.

Learn more details about Nsight DL Designer in this post.

Download >>


Easy Software Development with IDE support: Nsight Visual Studio 2021.3, Nsight Visual Studio Code Edition 2021.1 and Nsight Eclipse

NVIDIA Nsight Visual Studio Edition is an application development environment that brings GPU computing into Microsoft Visual Studio IDE. 

The Nsight Visual Studio Edition 2021.3 release provides Windows 11 support for full-fledged GPU kernel debugging and for inspecting code for bottlenecks, system utilization, and throughput improvements.

Nsight Visual Studio is included in the CUDA® toolkit, release 11.5, with bug fixes and performance improvements.

Download >>

Nsight Visual Studio Code Edition is an application development environment for heterogeneous platforms that brings CUDA® development for GPUs into Microsoft Visual Studio Code.

The Nsight Visual Studio Code Edition 2021.1 release includes features such as IntelliSense support for smart CUDA code completion, debugging of CPU and GPU code in a single session, remote development for cluster environments, and more.

Download >>

Nsight Eclipse Edition is a full-featured IDE powered by the Eclipse platform that provides an all-in-one integrated environment to edit, build, debug, and profile CUDA-C applications. Key highlights include seamless CPU and CUDA debugging, a native Eclipse plug-in, and Docker container support.

Download >>


Getting Started with NVIDIA Nsight DevTools

Nsight Compute: Download, Documentation, Web Page, [GTC Session] Understanding CUDA Application Behavior, Performance, and Optimization Just Got Easier with the Latest Developer Tools, [NVIDIA DLI] Optimizing CUDA Machine Learning Codes with Nsight Profiling Tools, [Demo Video] Guided Analysis with Nsight Compute, Forum.

Nsight Systems: Download, Documentation, Web Page, DevNews, [GTC Session] Understanding CUDA Application Behavior, Performance, and Optimization Just Got Easier with the Latest Developer Tools, Forum.

Nsight Graphics: Download, Documentation, Web Page, DevNews, [GTC Session] Leveraging NVIDIA Graphics DevTools for High-performance Ray-tracing Applications, [NVIDIA DLI] Developer Tools Fundamentals for Ray Tracing Using NVIDIA Nsight Graphics and NVIDIA Nsight Systems, Forum.

Nsight Perf SDK: Download, Documentation, Web Page, Forum.

Nsight Aftermath: Download, Documentation (included in download package), Web Page, Forum.

Nsight Deep Learning Designer: Download, Documentation, Web Page, DevBlog, [GTC Session] Optimize Neural Networks for Quality and Performance with Nsight DL Designer, Efficient model design for in-app inferencing with Nsight DL Designer, Forum.

Nsight Visual Studio Edition: Download, Documentation, Web Page, Forum.

Nsight Visual Studio Code Edition: Download, Documentation, Web Page, [GTC Session] It’s Alive! CUDA in Visual Studio Code, Demo video: Nsight Visual Studio Code Edition, Forum.

Nsight Eclipse Edition: Download (part of CUDA toolkit installer), Documentation, Web Page, Forum.


Sign up for the Developer Newsletter to stay informed about new announcements and releases!

Check out Nsight DevTool day sessions and DLI courses at GTC to learn more!

Categories
Misc

TIME Magazine Calls NVIDIA Omniverse One of Year’s 100 Best Inventions

NVIDIA Omniverse, a simulation and design collaboration platform for 3D virtual worlds, is already being evaluated by 700+ companies and 70,000 individuals.

TIME magazine today named NVIDIA Omniverse one of the 100 Best Inventions of 2021, saying the project is “making it easier to create ultra-realistic virtual spaces for…real-world purposes.”

Omniverse — a scalable, multi-GPU, real-time reference development platform for 3D simulation and design collaboration — is being evaluated by more than 700 companies and 70,000 individuals to create virtual worlds and unite teams, their assets, and creative applications in one streamlined interface. 

“Virtual worlds are for more than just gaming—they’re useful for planning infrastructure like roads and buildings, and they can also be used to test autonomous vehicles,” TIME wrote in the story, which hits the newsstands Nov. 22. “The platform combines the real-time ray-tracing technology of the brand’s latest graphics processing units with an array of open-source tools for collaborating live in photorealistic 3-D worlds.”

The list of 100 inventions, which TIME describes as “groundbreaking,” is based on multiple factors including originality, creativity, efficacy, ambition and impact. These projects, says TIME, are changing how we live, work, play and think about what’s possible.

In this week’s GTC keynote, NVIDIA CEO Jensen Huang announced the general availability of NVIDIA Omniverse Enterprise, and showed how companies like Ericsson, BMW and Lockheed Martin are using the platform to create digital twins to simulate 5G networks, build a robotic factory and prevent wildfires. 

He also revealed a host of new features for Omniverse, and powerful new capabilities including Omniverse Avatar for interactive conversational AI assistants and Omniverse Replicator, a powerful synthetic data generation engine for training autonomous vehicles and robots.



At the inaugural NVIDIA Omniverse Developer Day at GTC, developers were introduced to a new way to build, license, and distribute native applications, extensions, connectors, and microservices for the platform — opening new paths to market for millions of developers. To get started with NVIDIA Omniverse, download the free open beta for individuals, or explore Omniverse Enterprise. Omniverse users have access to a wide range of technical resources, tutorials and more with the NVIDIA Developer Program.

Categories
Misc

AWS Brings NVIDIA A10G Tensor Core GPUs to the Cloud with New EC2 G5 Instances

Read about the new EC2 G5 instance that powers remote graphics, visual computing, AI/ML training, and inference workloads on AWS cloud.

Today, AWS announced the general availability of the new Amazon EC2 G5 instances, powered by NVIDIA A10G Tensor Core GPUs. These instances are designed for the most demanding graphics-intensive applications, as well as for machine learning inference and for training simple to moderately complex machine learning models on the AWS cloud.

The new EC2 G5 instances feature up to eight NVIDIA A10G Tensor Core GPUs that are optimized for advanced visual computing workloads. With support for NVIDIA RTX technology and more RT (ray tracing) cores than any other NVIDIA GPU instance on AWS, they offer up to 3X better graphics performance. Based on the NVIDIA Ampere architecture, G5 instances offer up to 3X higher performance for machine learning inference and 3.3X higher performance for machine learning training, compared to the previous generation Amazon EC2 G4dn instances.

Customers can use the G5 instances to accelerate a broad range of graphics applications like interactive video rendering, video editing, computer-aided design, photorealistic simulations, 3D visualization, and gaming. G5 instances also deliver the best user experience for real-time AI inference performance at scale for use-cases like content and product recommendations, voice assistants, chatbots, and visual search.

Getting the most out of EC2 G5 instances using NVIDIA optimized software

To unlock the breakthrough graphics performance on the new G5 instances, creative and technical professionals can use the NVIDIA RTX Virtual Workstation (vWS) software, available from the AWS Marketplace. Only available from NVIDIA, these NVIDIA RTX vWS advancements include hundreds of certified professional ISV applications, support for all of the leading rendering apps, and optimization with all major gaming content. 

NVIDIA RTX technology delivers exceptional features like ray tracing and AI-denoising.  Creative professionals can achieve photorealistic quality with accurate shadows, reflections, and refractions—creating amazing content faster than ever before. 

NVIDIA RTX vWS also supports Deep Learning Super Sampling (DLSS). This gives designers, engineers, and artists the power of AI for producing the highest visual quality, from anywhere. They can also take advantage of technologies like NVIDIA Iray and NVIDIA OptiX for superior rendering capabilities.

Developers on AWS will soon be able to use state-of-the-art pretrained AI models, GPU-optimized deep learning frameworks, SDKs, and end-to-end application frameworks from the NGC Catalog on AWS Marketplace. In particular, developers can take advantage of NVIDIA TensorRT and NVIDIA Triton Inference Server to optimize inference performance and serve ML models at scale using G5 instances.

Developers have multiple options to take advantage of NVIDIA-optimized software on AWS: you can provision and manage the G5 instances yourself, or use them through AWS managed services such as Amazon Elastic Kubernetes Service (EKS) or Amazon Elastic Container Service (ECS).

Learn more about the EC2 G5 instances and get started. >>

Categories
Misc

New Online Course Offers Hands-on Machine Learning Using AWS and NVIDIA

AWS and NVIDIA have collaborated to develop an online course that introduces Amazon SageMaker with EC2 Instances powered by NVIDIA GPUs.

AWS and NVIDIA have collaborated to develop an online course that gives you a practical, simple-to-follow introduction to Amazon SageMaker with EC2 Instances powered by NVIDIA GPUs. The course is grounded in the practical application of services and gives you the opportunity to learn hands-on from experts in machine learning development. The approach is simple and straightforward: once you complete the course, you will have the confidence and competency to begin working on your ML project immediately.

Machine learning can be complex, tedious, and time-consuming. AWS and NVIDIA provide the fastest, most effective, and easiest-to-use ML tools to get you started on your ML project. Amazon SageMaker helps data scientists and developers prepare, build, train, and deploy high-quality ML models quickly by bringing together a broad set of capabilities purpose-built for ML. Amazon EC2 instances powered by NVIDIA GPUs, along with NVIDIA software, offer high-performance, GPU-optimized instances in the cloud for efficient model training and cost-effective model inference hosting.

In this course, you will first be given a high-level overview of modern machine learning. Then, we will dive right in and get you up and running with a GPU-powered SageMaker instance. You will learn how to prepare a dataset for training a model, how to build a model, how to execute the training of a model, and how to deploy and optimize a model. You will learn hands-on how to apply this workflow for computer vision (CV) and natural language processing (NLP) use cases.

After completing this course, you will be able to build, train, deploy, and optimize ML workflows with GPU acceleration in Amazon SageMaker and understand the key SageMaker services applicable to tabular, computer vision, and language ML tasks. You will feel empowered and have the confidence and competency to solve complex machine learning problems in a more efficient manner.  By using SageMaker, you will simplify workflows so you can build and deploy ML models quickly, freeing you up to focus on other problems to solve. 

Course Overview

This course is designed for machine learning practitioners, including data scientists and developers, who have a working knowledge of machine learning workflows. In this course, you will gain hands-on experience with Amazon SageMaker and Amazon EC2 instances powered by NVIDIA GPUs. There are four modules in the course:

Module 1 – Introduction to Amazon SageMaker and NVIDIA GPUs

In this module, you will learn about the purpose-built tools available within Amazon SageMaker for modern machine learning. This includes a tour of the Amazon SageMaker Studio IDE that can be used to prepare, build, train and tune, and deploy and manage your own ML models. Then you will learn how to use Amazon SageMaker classic notebooks and Amazon SageMaker Studio notebooks to develop natural language processing (NLP), computer vision (CV), and other ML models using RAPIDS. You will also dive deep into NVIDIA GPUs, the NGC Catalog, and instances available on AWS for ML.

Module 2 – GPU Accelerated Machine Learning Workflows with RAPIDS and Amazon SageMaker

In this module, you will apply your knowledge of NVIDIA GPUs and Amazon SageMaker. You will gain a background in GPU accelerated machine learning and perform the steps required to set up Amazon SageMaker. You will then learn about data acquisition and data transformation, move on to model design and training, and finish up by evaluating hyperparameter optimization, AutoML, and GPU accelerated inferencing.

Module 3 – Computer Vision

In this module, you will learn about the application of deep learning for computer vision (CV). As humans, half of our brains are devoted to visual processing, making it critical to how we perceive the world. Endowing machines with sight has been a challenging endeavor, but advancements in compute, algorithms, and data quality have made computer vision more accessible than ever before. From mobile cameras to industrial mechanic lenses, biological labs to hospital imaging, and self-driving cars to security cameras, data in pixel format is one of the most valuable types of data for consumers and companies. In this module, you will explore common CV applications, and you will learn how to build an end-to-end object detection model on Amazon SageMaker using NVIDIA GPUs.

Module 4 – Natural Language Processing

In this module, you will learn about applying deep learning technologies to the problem of language understanding. What does it mean to understand languages? What is language modeling? What is the BERT language model, and why are such language models used in many popular services like search, office productivity software, and voice agents? Are NVIDIA GPUs a fast and cost-efficient platform to train and deploy NLP Models? In this module, you will find answers to all those questions and more. Whether you are an experienced ML engineer considering implementation or a developer wanting to learn to deploy a language understanding model like BERT quickly, this module is for you.

Conclusion

AWS and NVIDIA provide fast, effective, easy-to-use ML tools to get you started on your ML project. Learn more about the course to guide you through your ML journey!

Categories
Misc

Looking for help on understanding the notation of filenames like "Mobilenet_V1_1.0_224_quant.tflite"

I’m looking at running some models from https://www.tensorflow.org/lite/guide/hosted_models

The model filename is something like `Mobilenet_V1_1.0_224_quant.tflite`

I understand that 224 is the input size but I’m not sure what the 1.0 represents. It would be useful if someone can tell me what the 1.0 means. Feel free to link some docs that would give me insight if you find that easier 🙂
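
For readers with the same question: in the MobileNet V1 naming convention, the number after `V1` is generally read as the depth (width) multiplier, which scales the number of channels in each layer, so `1.0` denotes the full-width network and smaller values such as `0.25` denote thinner, faster variants. Below is a minimal Python sketch that splits such a filename into its fields, assuming the pattern shown above holds. The regular expression, the class, and the function name are hypothetical and not part of TensorFlow.

```python
import re
from typing import NamedTuple


class MobileNetV1Name(NamedTuple):
    depth_multiplier: float  # channel-width scaling factor, e.g. 1.0
    input_size: int          # input image height and width in pixels
    quantized: bool          # True when the name carries the "_quant" suffix


# Assumed pattern: Mobilenet_V1_<multiplier>_<size>[_quant].tflite
_PATTERN = re.compile(
    r"^mobilenet_v1_(?P<mult>\d+\.\d+)_(?P<size>\d+)(?P<quant>_quant)?\.tflite$",
    re.IGNORECASE,
)


def parse_mobilenet_v1_name(filename):
    """Split a MobileNet V1 .tflite filename into its naming fields."""
    match = _PATTERN.match(filename)
    if match is None:
        raise ValueError(f"unrecognized MobileNet V1 filename: {filename!r}")
    return MobileNetV1Name(
        depth_multiplier=float(match["mult"]),
        input_size=int(match["size"]),
        quantized=match["quant"] is not None,
    )


info = parse_mobilenet_v1_name("Mobilenet_V1_1.0_224_quant.tflite")
print(info)
```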

Thanks in advance, really appreciate it.

submitted by /u/sonjpaul

Categories
Misc

Where can I find the inputs and outputs of a model?

Hi guys, I recently found a TensorFlow Lite model that I want to use in my Android app (link to the model below), but I couldn’t find its inputs and outputs.

I’m also wondering if there is a way to see the input and output types.

https://github.com/tensorflow/models/blob/master/research/slim/nets/mobilenet_v1.md

submitted by /u/Left_Complaint_6668

Categories
Offsites

Train in R, run on Android: Image segmentation with torch

We train a model for image segmentation in R, using torch together with luz, its high-level interface. We then JIT-trace the model on example input, so as to obtain an optimized representation that can run with no R installed. Finally, we show the model being run on Android.

Categories
Offsites

Simple Portfolio Optimization That Works!

Categories
Offsites

Newton’s Fractal (which Newton knew nothing about)