
Optimize Ray Tracing with NVIDIA Nsight Graphics 2021.5 Featuring Windows 11 Support

NVIDIA announced the latest release of Nsight Graphics, which supports Direct3D, Vulkan, OpenGL, OpenVR, and the Oculus SDK.

Today, NVIDIA announced Nsight Graphics 2021.5, the latest release, which supports Direct3D (11, 12, and DXR), Vulkan (1.2, NV Vulkan Ray Tracing Extension), OpenGL, OpenVR, and the Oculus SDK. Nsight Graphics is a standalone developer tool that enables you to debug, profile, and export frames from high-fidelity 3D graphics applications.

Key features:

  • Full Windows 11 support for both API Capture and Tracing.
  • Acceleration Structure Viewer with Bounding Volume Overlap Analysis.
  • Ability to specify the continual refresh flag via the Nsight API.
  • Support for Linux NGX.

Developers now have full support for Windows 11 for all activities, including Frame Profiling and GPU Trace profiling. The Acceleration Structure Overlap Analyzer is a noteworthy addition to this release, as it helps ensure that NVIDIA RTX ray tracing applications traverse the bounding volumes in your scene efficiently. Overlap has direct implications for performance, making this an important feature for anyone optimizing a ray tracing application.
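
To see why overlap matters: a ray that passes through a region shared by two bounding volumes must be tested against the contents of both, so traversal cost grows with the amount of overlap. The underlying test is simple; a minimal Python illustration (not Nsight Graphics code):

```python
def aabbs_overlap(a_min, a_max, b_min, b_max):
    """Axis-aligned bounding boxes overlap iff their extents overlap on every axis."""
    return all(a_min[i] <= b_max[i] and b_min[i] <= a_max[i] for i in range(3))

# Overlapping on x and y alone is not enough; these two boxes are disjoint in z:
print(aabbs_overlap((0, 0, 0), (1, 1, 1), (0.5, 0.5, 2.0), (1.5, 1.5, 3.0)))  # False
```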

View additional features and details on the Nsight Graphics feature page.

Figure 1. The Acceleration Structure Overlap Analyzer dialog, with colorization of the areas where overlaps occur.



Accelerated Model Building with NVIDIA Data Science Workbench

Learn how building models with NVIDIA Data Science Workbench can improve management and increase productivity.

Data scientists wrestle with many challenges that slow development. Operational tasks such as software stack management, installation, and updates eat into productivity. Reproducing state-of-the-art assets can be difficult, as modern workflows include many tedious and complex tasks. Access to the tools you need is not always fast or convenient, and the use of multiple tools and CLIs adds complexity to the data science lifecycle.

Master your Data Science environment

Building data science models is easier said than done. That’s why we are announcing NVIDIA Data Science Workbench to simplify and orchestrate tasks for data scientists, data engineers, and AI developers. Using a GPU-enabled mobile or desktop workstation, users can easily manage the software development environment for greater productivity and ease-of-use while quickly reproducing state-of-the-art examples to accelerate development. Through Workbench, key assets are just a click away.

Figure 1. NVIDIA Data Science Workbench and the enterprise data science stack. Workbench connects to NVIDIA GPU Cloud (NGC), JupyterLab, PyTorch, TensorFlow, RAPIDS, and multiple CLIs (including Kaggle, NGC, NVIDIA Data Science Stack, and AWS), on a foundation of an NVIDIA-Certified Data Science Workstation and OS supporting drivers, CUDA, nv-docker, and Docker.

Workbench enhances the development experience in several ways: 

Better management

Easily set up your work environment and manage NVIDIA Data Science Stack software versions. Access tools that provide optimized frameworks for GPU-accelerated performance, as well as automatic driver, CUDA, nv-docker, and NGC™ container updates. Also, get notified about other important updates.

Easy reproducibility

Build quality models faster based on state-of-the-art example code. Dockerize GitHub content and reproduce assets for your Jupyter environment. Use NGC™ containers for GPU-optimized code that also runs in AWS.
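
For context, the kind of step Workbench automates looks roughly like this with the Docker SDK for Python; the NGC image tag shown is an assumption, so check the RAPIDS page on NGC for a current one:

```python
import docker

client = docker.from_env()

# Pull a RAPIDS container from NGC and start it on port 8888,
# with all host GPUs passed through to the container
client.containers.run(
    "nvcr.io/nvidia/rapidsai/rapidsai:21.10-cuda11.2-runtime-ubuntu20.04-py3.8",
    ports={"8888/tcp": 8888},
    device_requests=[docker.types.DeviceRequest(count=-1, capabilities=[["gpu"]])],
    detach=True,
)
```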

Greater productivity

Easily install software and drivers, and quickly access Jupyter notebooks, software assets, Kaggle notebooks, GitHub, and more.

Try Workbench

The released version for Ubuntu 18.04 and 20.04 is now available. Click here for installation instructions. Also, watch this 90-second Workbench demo:



Video 1: The video shows Workbench as a desktop application and illustrates how easily NGC, Kaggle, and various data science tools and assets can be accessed.

“I installed the NVIDIA Data Science Workbench and quickly discovered that it was easy to reproduce Git content and download NGC containers for use in Jupyter. I was pleasantly surprised to learn that Workbench installs a data science software environment for you as well as addressing updates – which is usually a big hassle and a big consumption of time. I’d expect Workbench will become a popular tool for anyone building deep-learning models and other data science projects.”

Dr. Chanin Nantasenamat
Associate Professor of Bioinformatics at Mahidol University
Founder of Data Professor YouTube Channel

Attend our session at the NVIDIA GTC Conference to learn more about Workbench. GTC registration is required (registration is free).

Session ID and Title: A31396–Three Ways NVIDIA Improves the Data Science Experience

Date and Time: November 11, 2021 at 3:00am – 3:50am Pacific Time (on-demand afterward)

Workbench – your personal assistant

NVIDIA Data Science Workbench can make you more productive by providing a convenient framework on workstations for building models with best practices. Workbench runs on most GPU-enabled workstations, but NVIDIA-Certified Workstations are recommended. In the end, it's easier to manage your environment, reproduce assets, and leverage NGC, Kaggle, and Conda.

Workbench won’t build your code for you, but it will accelerate development, reduce confusion, and help deliver better quality models in less time. To learn more, read the Workbench webpage, or visit the Workbench forum.


Announcing NVIDIA Nsight Systems 2021.5

Nsight Systems helps you tune and scale software across CPUs and GPUs.

The latest update to NVIDIA Nsight Systems—a performance analysis tool—is now available for download. Designed to help you tune and scale software across CPUs and GPUs, this release introduces several improvements aimed at enhancing the profiling experience.

Nsight Systems is part of the powerful NVIDIA Nsight suite of debugging and profiling tools. You can start with Nsight Systems for an overall system view, and avoid picking less efficient optimizations based on assumptions and false-positive indicators.
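
If you profile Python workloads, one low-effort way to make timelines easier to read is NVTX annotation. A minimal sketch, assuming the nvtx package (pip install nvtx) is installed:

```python
import time

import nvtx  # pip install nvtx


@nvtx.annotate("preprocess", color="blue")
def preprocess():
    time.sleep(0.01)  # stand-in for real work


with nvtx.annotate("main_loop", color="green"):
    for _ in range(5):
        preprocess()
```

Run the script under nsys profile python app.py, and the named ranges appear on the Nsight Systems timeline alongside CUDA and OS runtime activity.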

2021.5 highlights

  • Statistics now available in the user interface
  • Multi-report view with horizontal and vertical layouts to aid investigations across server nodes, VMs, containers, ranks, and processes (coming soon)
  • Expert system now includes GPU utilization analysis for OpenGL and DX12
  • NVIDIA NIC InfiniBand metrics sampling (experimental)
  • DirectX12 memory operations and warnings
  • DXGI/DX12/Vulkan API call correlation to WDDM queue packets
  • Windows 11 support

Multi-report view enhancements (coming soon) improve investigations by merging into a single timeline reports that continue existing sessions, or reports captured simultaneously on other server nodes, VMs, containers, ranks, and processes.

Figure 1. Two MPI ranks from separate report files viewed together on a shared timeline

NVIDIA NIC InfiniBand metrics sampling (experimental) enables you to understand details of server communications, such as throughput, packet counts, and congestion notifications.

Figure 2. NVIDIA NIC InfiniBand metrics sampling

In the DirectX12 trace, a new memory operations row highlights memory usage warnings and situations where expensive functions are called while resources are non-persistently mapped.

Figure 3. DirectX12 memory operations and warnings

WDDM trace now correlates graphics API calls to queue packets so that you can better understand workload creation and its progress through the Windows Display Driver Model.

Figure 4. DXGI, DX12, and Vulkan API call correlation to WDDM queue packets



Accelerating Data Center AI with the NVIDIA Converged Accelerator Developer Kit

Introducing the first GPU+DPU in a single package.

The modern data center is becoming increasingly difficult to manage. There are billions of possible connection paths between applications and petabytes of log data. Static rules are insufficient to enforce security policies for dynamic microservices, and the sheer magnitude of log data is impossible for any human to analyze.

AI provides the only path to the secure and self-managed data center of the future.

The NVIDIA converged accelerator is the world’s first AI-enhanced DPU. It combines the computational power of GPUs with the network acceleration and security benefits of DPUs, creating a single platform for AI-enhanced data center management. Converged accelerators can apply AI-generated rules to every packet in the data center network, creating new possibilities for real-time security and management.

Figure 1. The NVIDIA converged accelerator combines a BlueField-2 DPU with an NVIDIA Ampere Architecture GPU. In standard mode, the DPU and GPU are connected by a dedicated PCIe Gen4 switch for full bandwidth outside of the host PCIe system.

At NVIDIA GTC, we are introducing two new converged accelerators. The A100X combines an A100 Tensor Core GPU with an NVIDIA BlueField-2 data processing unit (DPU) on a single module. The A30X combines an A30 Tensor Core GPU with the same BlueField-2 DPU. The converged cards have the unique ability to extend the BlueField-2 capabilities of offloading, isolating, and accelerating the network to include AI inference and training.

Both accelerators feature an integrated PCIe switch between the DPU and GPU. The integrated switch eliminates contention for host resources, enabling line-rate GPUDirect RDMA performance. The integrated switch also improves security by isolating data movement between the GPU and NIC.

AI-enhanced DPU

The converged accelerators support two modes of operation:

  • Standard–The BlueField-2 DPU and the GPU operate separately.
  • BlueField-X–The PCIe switch is reconfigured so that the GPU is dedicated to the DPU and no longer visible to the host system.

In BlueField-X mode, the GPU is dedicated exclusively to the operating system running on the DPU. BlueField-X mode creates a new class of accelerator never before seen in the industry: a GPU-accelerated DPU.

Figure 2. In BlueField-X mode, the x86 host sees only the BlueField-2 DPU; the PCIe switch connects the GPU exclusively to the DPU, which can then run AI workloads on the network data.

In BlueField-X mode, the GPU can run AI models on the data flowing through the DPU as a “bump in the wire.” There is no performance overhead and no compromise on security.  The AI model is fully accelerated without consuming host resources. 
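
A simple way to observe this isolation is to enumerate GPUs with NVML from both the host and the DPU's operating system; in BlueField-X mode, only the DPU side should report the GPU. A minimal sketch using the pynvml bindings:

```python
import pynvml  # pip install nvidia-ml-py3

pynvml.nvmlInit()

# On the x86 host in BlueField-X mode, this count should exclude the
# converged accelerator's GPU; on the DPU's OS, the GPU should appear.
count = pynvml.nvmlDeviceGetCount()
print(f"GPUs visible here: {count}")
for i in range(count):
    handle = pynvml.nvmlDeviceGetHandleByIndex(i)
    print(pynvml.nvmlDeviceGetName(handle))

pynvml.nvmlShutdown()
```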

BlueField-X unlocks novel use cases for cybersecurity, data center management, and I/O acceleration. For example, the Morpheus Cybersecurity framework uses machine learning to take action on security threats that were previously impossible to identify. Morpheus uses DPUs to harvest telemetry from every server in the data center and send it to GPU-equipped servers for analysis.

With BlueField-X, the AI models can run locally on the converged accelerator in each server. This allows Morpheus to analyze more data, faster, while simultaneously eliminating costly data movement and reducing the attack surface for malicious actors. Malware detection, data exfiltration prevention, and dynamic firewall rule creation are Morpheus use cases enabled by BlueField-X.

The Morpheus example only scratches the surface of what is possible with BlueField-X. Our customers routinely share ideas that we had not yet considered. To enable more creative exploration of AI-enabled networks, we are introducing the NVIDIA Converged Accelerator Developer Kit.

With this developer kit, we provide early access to A30X accelerators for select customers and partners building the next generation of accelerated AI network applications. Discover new applications for BlueField-X in edge computing or data center management. Some ideas to help get you started include:

  • Transparent video preprocessing–Bump-in-the-wire video preprocessing (decryption, deinterlacing, format conversion, and so on) to improve IVA throughput and camera density.
  • Small-cell RU solution–RAN signal processing on a converged accelerator to increase subscriber density and throughput on a commodity gNodeB server.
  • Computational storage–Bump-in-the-wire storage encryption, indexing, and hashing to offload costly CPU cycles from a storage host preparing data for long-term storage.
  • Cheating detection–Detect malicious gameplay and cheating in a streaming gaming service.

Get started with the NVIDIA Converged Accelerator Developer Kit

The NVIDIA Converged Accelerator Developer Kit contains sample applications that combine CUDA and NVIDIA DOCA, plus documentation to help you install and configure your new converged accelerator. Most importantly, we provide access to an A30X and usage support in exchange for feedback.

To get started, simply register your interest on the NVIDIA Converged Accelerator Developer Kit webpage. If approved, we will contact you once the hardware is ready to ship and you can start building the next generation of accelerated applications.

We hope that you share our excitement for building a new class of real-time AI applications for data center management and edge computing. Let the discovery begin.


NVIDIA GTC Sees Spike in Developers From Africa

The appetite for AI and data science is increasing, and nowhere is that more prevalent than in emerging markets. Registrations for this week's GTC from African nations tripled compared with the spring edition of the event. Indeed, Nigeria had the third most registered attendees among countries in the EMEA region, ahead of France, Italy and…



New Online Course Offers Hands-on Machine Learning Using AWS and NVIDIA

AWS and NVIDIA have collaborated to develop an online course that introduces Amazon SageMaker with EC2 instances powered by NVIDIA GPUs.

AWS and NVIDIA have collaborated to develop an online course that offers a practical, simple-to-follow introduction to Amazon SageMaker with EC2 instances powered by NVIDIA GPUs. The course is grounded in the practical application of services and gives you the opportunity to learn hands-on from experts in machine learning development. Once you complete it, you will have the confidence and competency to begin working on your ML project immediately.

Machine learning can be complex, tedious, and time-consuming. AWS and NVIDIA provide the fastest, most effective, and easy-to-use ML tools to get you started on your ML project. Amazon SageMaker helps data scientists and developers prepare, build, train, and deploy high-quality ML models quickly by bringing together a broad set of capabilities purpose-built for ML. Amazon EC2 instances powered by NVIDIA GPUs along with NVIDIA software offer high-performance, GPU-optimized instances in the cloud for efficient model training and cost-effective model inference hosting.

In this course, you will first be given a high-level overview of modern machine learning. Then, we will dive right in and get you up and running with a GPU-powered SageMaker instance. You will learn how to prepare a dataset for training a model, how to build a model, how to execute the training of a model, and how to deploy and optimize a model. You will learn hands-on how to apply this workflow for computer vision (CV) and natural language processing (NLP) use cases.
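
As a taste of that workflow, here is a minimal sketch of launching a GPU-backed training job with the SageMaker Python SDK; the entry-point script, S3 path, and framework versions are hypothetical placeholders rather than course materials:

```python
import sagemaker
from sagemaker.pytorch import PyTorch

role = sagemaker.get_execution_role()  # IAM role of the notebook instance

estimator = PyTorch(
    entry_point="train.py",           # hypothetical training script
    role=role,
    instance_count=1,
    instance_type="ml.p3.2xlarge",    # NVIDIA V100-backed GPU instance
    framework_version="1.9",
    py_version="py38",
)

estimator.fit({"training": "s3://my-bucket/train"})  # hypothetical S3 prefix

# Deploy the trained model to a GPU-backed real-time endpoint
predictor = estimator.deploy(initial_instance_count=1, instance_type="ml.g4dn.xlarge")
```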

After completing this course, you will be able to build, train, deploy, and optimize ML workflows with GPU acceleration in Amazon SageMaker, and you will understand the key SageMaker services applicable to tabular, computer vision, and language ML tasks. You will have the confidence and competency to solve complex machine learning problems more efficiently. SageMaker simplifies workflows so you can build and deploy ML models quickly, freeing you to focus on the problems that remain.

Course Overview

This course is designed for machine learning practitioners, including data scientists and developers, who have a working knowledge of machine learning workflows. In this course, you will gain hands-on experience with Amazon SageMaker and Amazon EC2 instances powered by NVIDIA GPUs. There are four modules in the course:

Module 1 – Introduction to Amazon SageMaker and NVIDIA GPUs

In this module, you will learn about the purpose-built tools available within Amazon SageMaker for modern machine learning. This includes a tour of the Amazon SageMaker Studio IDE that can be used to prepare, build, train and tune, and deploy and manage your own ML models. Then you will learn how to use Amazon SageMaker classic notebooks and Amazon SageMaker Studio notebooks to develop natural language processing (NLP), computer vision (CV), and other ML models using RAPIDS. You will also dive deep into NVIDIA GPUs, the NGC Catalog, and instances available on AWS for ML.

Module 2 – GPU Accelerated Machine Learning Workflows with RAPIDS and Amazon SageMaker

In this module, you will apply your knowledge of NVIDIA GPUs and Amazon SageMaker. You will gain a background in GPU accelerated machine learning and perform the steps required to set up Amazon SageMaker. You will then learn about data acquisition and data transformation, move on to model design and training, and finish up by evaluating hyperparameter optimization, AutoML, and GPU accelerated inferencing.
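
To give a flavor of that workflow, here is a minimal sketch of a GPU-accelerated tabular pipeline with cuDF and cuML; the CSV path and label column are hypothetical:

```python
import cudf
from cuml.ensemble import RandomForestClassifier
from cuml.model_selection import train_test_split

# Load a tabular dataset directly into GPU memory (hypothetical file)
df = cudf.read_csv("train.csv")
X = df.drop(columns=["label"])
y = df["label"].astype("int32")

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

# Train and evaluate entirely on the GPU
clf = RandomForestClassifier(n_estimators=100)
clf.fit(X_train, y_train)
print(clf.score(X_test, y_test))
```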

Module 3 – Computer Vision

In this module, you will learn about the application of deep learning to computer vision (CV). In humans, a large share of the brain is devoted to visual processing, making sight central to how we perceive the world. Endowing machines with sight has been a challenging endeavor, but advancements in compute, algorithms, and data quality have made computer vision more accessible than ever before. From mobile cameras to industrial machine-vision lenses, biological labs to hospital imaging, and self-driving cars to security cameras, data in pixel format is one of the most valuable types of data for consumers and companies. You will explore common CV applications and learn how to build an end-to-end object detection model on Amazon SageMaker using NVIDIA GPUs.
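
The course builds its detector on SageMaker, but the general shape of an object detection model can be sketched with torchvision (an illustration, not the course's exact stack):

```python
import torch
import torchvision

# A COCO-pretrained two-stage detector, run on the GPU
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(pretrained=True)
model.eval().to("cuda")

image = torch.rand(3, 480, 640, device="cuda")  # stand-in for a real image tensor
with torch.no_grad():
    predictions = model([image])

# Each prediction holds boxes, class labels, and confidence scores
print(predictions[0]["boxes"].shape, predictions[0]["scores"][:5])
```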

Module 4 – Natural Language Processing

In this module, you will learn about applying deep learning technologies to the problem of language understanding. What does it mean to understand language? What is language modeling? What is the BERT language model, and why are such language models used in many popular services like search, office productivity software, and voice agents? Are NVIDIA GPUs a fast and cost-efficient platform for training and deploying NLP models? In this module, you will find answers to all those questions and more. Whether you are an experienced ML engineer considering implementation or a developer wanting to learn to deploy a language understanding model like BERT quickly, this module is for you.
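
As a quick illustration of what a deployed language model does, here is a BERT-style question-answering pipeline using the Hugging Face transformers library (an assumption for illustration, not necessarily the course's stack):

```python
from transformers import pipeline

# device=0 places the model on the first NVIDIA GPU
qa = pipeline("question-answering", device=0)

result = qa(
    question="What does BERT stand for?",
    context="BERT stands for Bidirectional Encoder Representations from Transformers.",
)
print(result["answer"], result["score"])
```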

Conclusion

AWS and NVIDIA provide fast, effective, easy-to-use ML tools to get you started on your ML project. Learn more about the course and let it guide you through your ML journey!


AWS Brings NVIDIA A10G Tensor Core GPUs to the Cloud with New EC2 G5 Instances

Read about the new EC2 G5 instances that power remote graphics, visual computing, AI/ML training, and inference workloads on the AWS cloud.

Today, AWS announced the general availability of the new Amazon EC2 G5 instances, powered by NVIDIA A10G Tensor Core GPUs. These instances are designed for the most demanding graphics-intensive applications, as well as for machine learning inference and for training simple to moderately complex machine learning models on the AWS cloud.

The new EC2 G5 instances feature up to eight NVIDIA A10G Tensor Core GPUs that are optimized for advanced visual computing workloads. With support for NVIDIA RTX technology and more RT (ray tracing) cores than any other NVIDIA GPU instance on AWS, G5 instances offer up to 3X better graphics performance. Based on the NVIDIA Ampere Architecture, G5 instances also offer up to 3X higher performance for machine learning inference and 3.3X higher performance for machine learning training, compared to the previous-generation Amazon EC2 G4dn instances.

Customers can use the G5 instances to accelerate a broad range of graphics applications like interactive video rendering, video editing, computer-aided design, photorealistic simulations, 3D visualization, and gaming. G5 instances also deliver the best user experience for real-time AI inference performance at scale for use-cases like content and product recommendations, voice assistants, chatbots, and visual search.

Getting the most out of EC2 G5 instances using NVIDIA optimized software

To unlock the breakthrough graphics performance on the new G5 instances, creative and technical professionals can use the NVIDIA RTX Virtual Workstation (vWS) software, available from the AWS Marketplace. Only available from NVIDIA, these NVIDIA RTX vWS advancements include hundreds of certified professional ISV applications, support for all of the leading rendering apps, and optimization with all major gaming content. 

NVIDIA RTX technology delivers exceptional features like ray tracing and AI-denoising.  Creative professionals can achieve photorealistic quality with accurate shadows, reflections, and refractions—creating amazing content faster than ever before. 

NVIDIA RTX vWS also supports Deep Learning Super Sampling (DLSS). This gives designers, engineers, and artists the power of AI for producing the highest visual quality, from anywhere. They can also take advantage of technologies like NVIDIA Iray and NVIDIA OptiX for superior rendering capabilities.

Developers on AWS will soon be able to use state-of-the-art pretrained AI models, GPU-optimized deep learning frameworks, SDKs, and end-to-end application frameworks from the NGC Catalog on AWS Marketplace. In particular, developers can take advantage of NVIDIA TensorRT and NVIDIA Triton Inference Server to optimize inference performance and serve ML models at scale using G5 instances.
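
For example, once a model is being served by Triton on a G5 instance, a client request is only a few lines of Python. The server address, model name, and tensor names below are assumptions that depend on your model's configuration:

```python
import numpy as np
import tritonclient.http as httpclient

# Assumes a Triton server on localhost:8000 serving a model named "resnet50"
client = httpclient.InferenceServerClient(url="localhost:8000")

inp = httpclient.InferInput("input", [1, 3, 224, 224], "FP32")
inp.set_data_from_numpy(np.random.rand(1, 3, 224, 224).astype(np.float32))

result = client.infer(model_name="resnet50", inputs=[inp])
print(result.as_numpy("output").shape)  # tensor names come from the model config
```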

Developers have multiple options for taking advantage of NVIDIA-optimized software on AWS, whether you provision and manage the G5 instances yourself or use them through AWS managed services like Amazon Elastic Kubernetes Service (EKS) or Amazon Elastic Container Service (ECS).

Learn more about the EC2 G5 instances and get started.


TIME Magazine Calls NVIDIA Omniverse One of Year’s 100 Best Inventions

NVIDIA Omniverse, a simulation and design collaboration platform for 3D virtual worlds, is already being evaluated by 700+ companies and 70,000 individuals.

TIME magazine today named NVIDIA Omniverse one of the 100 Best Inventions of 2021, saying the project is “making it easier to create ultra-realistic virtual spaces for…real-world purposes.”

Omniverse — a scalable, multi-GPU, real-time reference development platform for 3D simulation and design collaboration — is being evaluated by more than 700 companies and 70,000 individuals to create virtual worlds and unite teams, their assets, and creative applications in one streamlined interface. 

“Virtual worlds are for more than just gaming—they’re useful for planning infrastructure like roads and buildings, and they can also be used to test autonomous vehicles,” TIME wrote in the story, which hits the newsstands Nov. 22. “The platform combines the real-time ray-tracing technology of the brand’s latest graphics processing units with an array of open-source tools for collaborating live in photorealistic 3-D worlds.”

The list of 100 inventions, which TIME describes as “groundbreaking,” is based on multiple factors including originality, creativity, efficacy, ambition and impact. These projects, says TIME, are changing how we live, work, play and think about what’s possible.

In this week’s GTC keynote, NVIDIA CEO Jensen Huang announced the general availability of NVIDIA Omniverse Enterprise, and showed how companies like Ericsson, BMW and Lockheed Martin are using the platform to create digital twins to simulate 5G networks, build a robotic factory and prevent wildfires. 

He also revealed a host of new features for Omniverse, and powerful new capabilities including Omniverse Avatar for interactive conversational AI assistants and Omniverse Replicator, a powerful synthetic data generation engine for training autonomous vehicles and robots.



At the inaugural NVIDIA Omniverse Developer Day at GTC, developers were introduced to a new way to build, license, and distribute native applications, extensions, connectors, and microservices for the platform — opening new paths to market for millions of developers.

To get started with NVIDIA Omniverse, download the free open beta for individuals, or explore Omniverse Enterprise. Omniverse users have access to a wide range of technical resources, tutorials, and more with the NVIDIA Developer Program.


Where can I find the inputs and outputs of a model?

Hi guys, lately I found a TensorFlow Lite model that I want to use in my Android app (link to the model below), but I couldn't find its inputs and outputs.

I'm wondering if there is a way to see the input and output types.

https://github.com/tensorflow/models/blob/master/research/slim/nets/mobilenet_v1.md
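
One way to see the input and output types is the TFLite Python interpreter, which reports each tensor's name, shape, dtype, and quantization parameters. A minimal sketch, assuming the .tflite file is downloaded locally:

```python
import tensorflow as tf

interpreter = tf.lite.Interpreter(model_path="mobilenet_v1_1.0_224_quant.tflite")
interpreter.allocate_tensors()

# Each entry lists the tensor's name, shape, dtype, and quantization parameters
print(interpreter.get_input_details())
print(interpreter.get_output_details())
```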

submitted by /u/Left_Complaint_6668


Looking for help on understanding the notation of filenames like "Mobilenet_V1_1.0_224_quant.tflite"

I’m looking at running some models from https://www.tensorflow.org/lite/guide/hosted_models

The model filename is something like `Mobilenet_V1_1.0_224_quant.tflite`

I understand that 224 is the input size, but I'm not sure what the 1.0 represents. It would be useful if someone could tell me what the 1.0 means. Feel free to link some docs that would give me insight if you find that easier 🙂
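
For what it's worth, the 1.0 in that naming scheme is MobileNet's width multiplier (alpha), which scales the number of channels in every layer; 224 is the input resolution, and quant marks a quantized model. A small sketch of the same knob in Keras (the tf.keras API here is an assumption, not from the hosted-models page):

```python
import tensorflow as tf

# alpha is the width multiplier from the filename: 1.0 keeps the full number
# of channels, 0.25 shrinks every layer to a quarter of the channels
full = tf.keras.applications.MobileNet(alpha=1.0, input_shape=(224, 224, 3))
slim = tf.keras.applications.MobileNet(alpha=0.25, input_shape=(224, 224, 3))

print(full.count_params())  # ~4.2M parameters
print(slim.count_params())  # ~0.5M parameters
```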

Thanks in advance, really appreciate it.

submitted by /u/sonjpaul