NVIDIA GTC is live and bustling, bringing together the world’s most brilliant and creative minds who shape our world with the power of AI, computer graphics and more. At the show, we announced new features for NVIDIA Omniverse, our real-time digital-twin simulation and collaboration platform for 3D workflows. These include Omniverse VR, Remote and Showroom, Read article >
Researchers tackling the challenge of visual misinformation — think the TikTok video of Tom Cruise supposedly golfing in Italy during the pandemic — must continuously advance their tools to identify AI-generated images. NVIDIA is furthering this effort by collaborating with researchers to support the development and testing of detector algorithms on our state-of-the-art image-generation models. Read article >
This week’s GFN Thursday packs a prehistoric punch with the release of Jurassic World Evolution 2. It also gets infinitely brighter with the release of Bright Memory: Infinite. Both games feature NVIDIA RTX technologies and are part of the six titles joining the GeForce NOW library this week. GeForce NOW RTX 3080 members will get Read article >
For millions of professionals around the world, 3D workflows are essential. Everything they build, from cars to products to buildings, must first be designed or simulated in a virtual world. At the same time, more organizations are tackling complex designs while adjusting to a hybrid work environment. As a result, design teams need a solution Read article >
The earth is warming. The past seven years are on track to be the seven warmest on record. The emissions of greenhouse gases from human activities are responsible for approximately 1.1°C of average warming since the period 1850-1900. What we’re experiencing is very different from the global average. We experience extreme weather — historic droughts, Read article >
The appetite for AI and data science is increasing, and nowhere is that more prevalent than in emerging markets. Registrations for this week’s GTC from African nations tripled compared with the spring edition of the event. Indeed, Nigeria had the third most registered attendees for countries in the EMEA region, ahead of France, Italy and Read article >
Introducing the first GPU+DPU in a single package.
The modern data center is becoming increasingly difficult to manage. There are billions of possible connection paths between applications and petabytes of log data. Static rules are insufficient to enforce security policies for dynamic microservices, and the sheer magnitude of log data is impossible for any human to analyze.
AI provides the only path to the secure and self-managed data center of the future.
The NVIDIA converged accelerator is the world’s first AI-enhanced DPU. It combines the computational power of GPUs with the network acceleration and security benefits of DPUs, creating a single platform for AI-enhanced data center management. Converged accelerators can apply AI-generated rules to every packet in the data center network, creating new possibilities for real-time security and management.
Figure 1. In standard mode, the BlueField-2 DPU and GPU are connected by a dedicated PCIe Gen4 switch for full bandwidth outside of the host PCIe system.
At NVIDIA GTC, we are introducing two new converged accelerators. The A100X combines an A100 Tensor Core GPU with an NVIDIA BlueField-2 data processing unit on a single module. The A30X combines an A30 Tensor Core GPU with the same BlueField-2 DPU. The converged cards have the unique ability to extend the BlueField-2 capabilities of offloading, isolating, and accelerating the network to include AI inference and training.
Both accelerators feature an integrated PCIe switch between the DPU and GPU. The integrated switch eliminates contention for host resources, enabling line-rate GPUDirect RDMA performance. The integrated switch also improves security by isolating data movement between the GPU and NIC.
AI-enhanced DPU
The converged accelerators support two modes of operation:
Standard–The BlueField-2 DPU and the GPU operate separately.
BlueField-X–The PCIe switch is reconfigured so the GPU is dedicated to the DPU and no longer visible to the host system.
In BlueField-X mode, the GPU is dedicated exclusively to the operating system running on the DPU. BlueField-X mode creates a new class of accelerator never before seen in the industry: a GPU-accelerated DPU.
Figure 2. In BlueField-X mode the x86 host only sees the BlueField-2 DPU, allowing the DPU to run AI workloads on the network data.
In BlueField-X mode, the GPU can run AI models on the data flowing through the DPU as a “bump in the wire.” There is no performance overhead and no compromise on security. The AI model is fully accelerated without consuming host resources.
BlueField-X unlocks novel use cases for cybersecurity, data center management, and I/O acceleration. For example, the Morpheus Cybersecurity framework uses machine learning to take action on security threats that were previously impossible to identify. Morpheus uses DPUs to harvest telemetry from every server in the data center and send it to GPU-equipped servers for analysis.
With BlueField-X, the AI models can run locally on the converged accelerator in each server. This allows Morpheus to analyze more data, faster, while simultaneously eliminating costly data movement and reducing the attack surface for malicious actors. Malware detection, data exfiltration prevention, and dynamic firewall rule creation are Morpheus use cases enabled by BlueField-X.
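To make the dynamic firewall rule use case more concrete, here is a minimal sketch in plain Python of scoring per-flow telemetry locally and deriving block rules from the scores. It is not Morpheus code and calls no Morpheus or DOCA API: the `FlowRecord` type, the `score_flow` heuristic, the baseline rate, and the threshold are all hypothetical stand-ins for a trained model running on the converged accelerator.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class FlowRecord:
    """Hypothetical per-flow telemetry record (not a Morpheus type)."""
    src_ip: str
    dst_ip: str
    bytes_out: int
    duration_s: float


def score_flow(flow: FlowRecord, baseline_bps: float) -> float:
    """Toy anomaly score: outbound byte rate relative to an assumed baseline.

    A real deployment would run a trained model here; this ratio is only a
    placeholder so the control flow can be shown end to end.
    """
    if flow.duration_s <= 0:
        return 0.0
    return (flow.bytes_out / flow.duration_s) / baseline_bps


def derive_block_rules(flows: List[FlowRecord],
                       baseline_bps: float = 1_000_000.0,
                       threshold: float = 10.0) -> List[str]:
    """Return one illustrative deny rule per source whose score exceeds the threshold."""
    blocked: List[str] = []
    for flow in flows:
        if score_flow(flow, baseline_bps) > threshold and flow.src_ip not in blocked:
            blocked.append(flow.src_ip)
    return [f"deny from {ip}" for ip in blocked]


if __name__ == "__main__":
    sample = [
        FlowRecord("10.0.0.5", "10.0.1.9", 2_000_000, 2.0),        # about 1 MB/s
        FlowRecord("10.0.0.7", "203.0.113.4", 500_000_000, 10.0),  # about 50 MB/s
    ]
    print(derive_block_rules(sample))  # ['deny from 10.0.0.7']
```

The point of the sketch is the data path rather than the heuristic: telemetry is scored where it is collected, and only the resulting rules need to leave the server.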
The Morpheus example only scratches the surface of what is possible with BlueField-X. Our customers routinely share ideas that we had not yet considered. To enable more creative exploration of AI-enabled networks, we are introducing the NVIDIA Converged Accelerator Developer Kit.
With this developer kit, we provide early access to A30X accelerators for select customers and partners building the next generation of accelerated AI network applications. Discover new applications for BlueField-X in edge computing or data center management. Some ideas to help get you started include:
Transparent video preprocessing–Bump-in-the-wire video preprocessing (decryption, interlacing, format conversion, etc.) to improve IVA throughput and camera density.
Small-cell RU solution–RAN signal processing on a converged accelerator to increase subscriber density and throughput on a commodity gNodeB server.
Computational storage–Bump-in-the-wire storage encryption, indexing, and hashing to offload costly CPU cycles from a storage host preparing data for long-term storage.
Cheating detection–Detect malicious gameplay or cheating in a streaming gaming service.
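As an illustration of the kind of work the computational storage idea above would move off the storage host, the following sketch hashes fixed-size chunks of a byte stream and builds a digest-to-offset index. It uses only the Python standard library and runs on the CPU; the function name `hash_and_index` and the 4 KiB chunk size are assumptions made for this example, not part of the developer kit.

```python
import hashlib
from typing import Dict, List, Tuple


def hash_and_index(data: bytes,
                   chunk_size: int = 4096) -> Tuple[List[str], Dict[str, List[int]]]:
    """Split data into fixed-size chunks, hash each one, and index offsets by digest."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    digests: List[str] = []
    index: Dict[str, List[int]] = {}
    for offset in range(0, len(data), chunk_size):
        digest = hashlib.sha256(data[offset:offset + chunk_size]).hexdigest()
        digests.append(digest)
        # Identical chunks share a digest, so the index also exposes duplicates.
        index.setdefault(digest, []).append(offset)
    return digests, index


if __name__ == "__main__":
    payload = b"A" * 4096 + b"B" * 4096 + b"A" * 4096
    digests, index = hash_and_index(payload)
    print(len(digests), len(index), index[digests[0]])  # 3 2 [0, 8192]
```

In a bump-in-the-wire design, a loop like this would run on the accelerator as data passes through, so the host receives data that is already hashed and indexed.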
Get started with the NVIDIA Converged Accelerator Developer Kit
The NVIDIA Converged Accelerator Developer Kit contains sample applications that combine CUDA and NVIDIA DOCA, and documentation to help you install and configure your new converged accelerator. Most importantly, we provide access to an A30X and usage support in exchange for feedback.
To get started, simply register your interest on the NVIDIA Converged Accelerator Developer Kit webpage. If approved, we will contact you once the hardware is ready to ship and you can start building the next generation of accelerated applications.
We hope that you share our excitement for building a new class of real-time AI applications for data center management and edge computing. Let the discovery begin.
Nsight Systems helps you tune and scale software across CPUs and GPUs.
The latest update to NVIDIA Nsight Systems, a performance analysis tool, is now available for download. Designed to help you tune and scale software across CPUs and GPUs, this release introduces several improvements intended to enhance the profiling experience.
Nsight Systems is part of the NVIDIA Nsight Tools Suite of debugging and profiling tools. Start with Nsight Systems for an overall system view, so that you avoid choosing less efficient optimizations based on assumptions and false-positive indicators.
2021.5 highlights
Statistics now available in user interface
Multi-report view with horizontal and vertical layouts to aid investigations across server nodes, VMs, containers, ranks, and processes (coming soon)
Expert system now includes GPU utilization analysis for OpenGL and DX12
NVIDIA NIC InfiniBand metrics sampling (experimental)
DirectX12 memory operations and warnings
DXGI/DX12/Vulkan API calls correlation to WDDM queue packets
Windows 11 support
Multi-report view enhancements (coming soon) can improve investigations. They merge into a single timeline any reports that are continuations of existing sessions, or that were captured simultaneously from other server nodes, VMs, containers, ranks, and processes.
Figure 1. Two MPI ranks from separate report files viewed together on a shared timeline
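Conceptually, placing several reports on one timeline is a merge of event streams by timestamp. The sketch below shows that idea in plain Python; it does not read Nsight Systems report files, and the `Event` tuple layout and the `merge_timelines` name are hypothetical. It also assumes that each stream is already sorted and that all timestamps share a common clock.

```python
import heapq
from typing import Iterable, List, Tuple

# Hypothetical event shape: (timestamp in seconds, source label, event name).
Event = Tuple[float, str, str]


def merge_timelines(*reports: Iterable[Event]) -> List[Event]:
    """Merge per-report event streams, each sorted by timestamp, into one list."""
    return list(heapq.merge(*reports, key=lambda event: event[0]))


if __name__ == "__main__":
    rank0 = [(0.0, "rank0", "MPI_Init"), (2.0, "rank0", "MPI_Send")]
    rank1 = [(1.0, "rank1", "MPI_Init"), (3.0, "rank1", "MPI_Recv")]
    for event in merge_timelines(rank0, rank1):
        print(event)
```

`heapq.merge` keeps the merge lazy and linear in the total number of events, which matters when each report holds a long trace.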
NVIDIA NIC InfiniBand metrics sampling (experimental) enables you to understand details of server communications, such as throughput, packet counts, and congestion notifications.
Figure 2. NVIDIA NIC InfiniBand metrics sampling
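Throughput from sampled counters is simple arithmetic: the change in a cumulative byte counter divided by the time between samples. The sketch below shows that calculation only. The `(timestamp, cumulative_bytes)` sample format and the `throughput_gbps` name are assumptions for illustration and do not describe how Nsight Systems stores or exposes these metrics.

```python
from typing import List, Tuple


def throughput_gbps(samples: List[Tuple[float, int]]) -> List[float]:
    """Convert (timestamp_s, cumulative_bytes) samples to per-interval Gbit/s."""
    rates: List[float] = []
    for (t0, b0), (t1, b1) in zip(samples, samples[1:]):
        dt = t1 - t0
        if dt <= 0:
            raise ValueError("timestamps must be strictly increasing")
        # Bytes to bits, then divide by the interval and scale to gigabits.
        rates.append((b1 - b0) * 8 / dt / 1e9)
    return rates


if __name__ == "__main__":
    samples = [(0.0, 0), (1.0, 1_250_000_000), (2.0, 3_750_000_000)]
    print(throughput_gbps(samples))  # [10.0, 20.0]
```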
In DirectX12 trace, a new memory operations row highlights memory usage warnings and cases where expensive functions are called while resources are non-persistently mapped.
Figure 3. DirectX12 memory operations and warnings
WDDM trace now correlates graphics API calls to queue packets so that you can better understand workload creation and its progress through the Windows display driver model.
Figure 4. DXGI, DX12, and Vulkan API call correlation to WDDM queue packets
Learn how building models with NVIDIA Data Science Workbench can improve management and increase productivity.
Data scientists wrestle with many challenges that slow development. There are operational tasks, including software stack management, installation, and updates that impact productivity. Reproducing state-of-the-art assets can be difficult as modern workflows include many tedious and complex tasks. Access to the tools you need is not always fast or convenient. Also, the use of multiple tools and CLIs adds complexity to the data science lifecycle.
Master your Data Science environment
Building data science models is easier said than done. That’s why we are announcing NVIDIA Data Science Workbench to simplify and orchestrate tasks for data scientists, data engineers, and AI developers. Using a GPU-enabled mobile or desktop workstation, users can easily manage the software development environment for greater productivity and ease-of-use while quickly reproducing state-of-the-art examples to accelerate development. Through Workbench, key assets are just a click away.
Figure 1: NVIDIA Data Science Workbench and Enterprise Data Science Stack.
Workbench enhances the development experience in several ways:
Better management
Easily set up your work environment and manage NVIDIA Data Science Stack software versions. Access tools that provide optimized frameworks for GPU accelerated performance as well as automatic driver, CUDA, nv-docker, and NGC container updates. Also, get notified about other important updates.
Easy reproducibility
Build quality models faster based on state-of-the-art example code. Dockerize GitHub content and reproduce assets for your Jupyter environment. Use NGC containers for GPU-optimized code that also runs in AWS.
Greater productivity
Install software and drivers easily, and quickly access the Jupyter notebook, software assets, Kaggle notebooks, GitHub, and more. Use NGC containers for GPU-optimized code that also runs in AWS.
Try Workbench
The released version for Ubuntu 18.04 and 20.04 is now available. Click here for installation instructions. Also, watch this 90-second Workbench demo:
Video 1: The video shows Workbench as a desktop application and illustrates how NGC, Kaggle, and various data science tools and assets are easily accessed.
“I installed the NVIDIA Data Science Workbench and quickly discovered that it was easy to reproduce Git content and download NGC containers for use in Jupyter. I was pleasantly surprised to learn that Workbench installs a data science software environment for you as well as addressing updates – which is usually a big hassle and a big consumption of time. I’d expect Workbench will become a popular tool for anyone building deep-learning models and other data science projects.”
Dr. Chanin Nantasenamat Associate Professor of Bioinformatics at Mahidol University Founder of Data Professor YouTube Channel
Attend our session at the NVIDIA GTC Conference to learn more about Workbench. GTC registration is required (registration is free).
Date and Time: November 11, 2021 at 3:00am – 3:50am Pacific Time (on-demand afterward)
Workbench – your personal assistant
NVIDIA Data Science Workbench can make you more productive by providing a convenient framework on workstations for building models that use best practices. Workbench will run on most GPU-enabled workstations, but NVIDIA-Certified Workstations are recommended. The result is an environment that is easier to manage and reproduce, with helpful assets from NGC, Kaggle, and Conda close at hand.
Workbench won’t build your code for you, but it will accelerate development, reduce confusion, and help deliver better quality models in less time. To learn more, read the Workbench webpage, or visit the Workbench forum.
NVIDIA announced the latest release in Nsight Graphics, which supports Direct3D, Vulkan, OpenGL, OpenVR, and the Oculus SDK.
Today, NVIDIA announced Nsight Graphics 2021.5, the latest release, which supports Direct3D (11, 12, and DXR), Vulkan (1.2, NV Vulkan Ray Tracing Extension), OpenGL, OpenVR, and the Oculus SDK. Nsight Graphics is a standalone developer tool that enables you to debug, profile, and export frames built with high-fidelity 3D graphics applications.
Key features:
Full Windows 11 support for both API Capture and Tracing.
Acceleration Structure Viewer with Bounding Volume Overlap Analysis.
Users are able to specify the continual refresh flag via the Nsight API.
Support for Linux NGX.
Developers now have full support for Windows 11 for all activities, including Frame Profiling and GPU Trace profiling. The Acceleration Structure Overlap Analyzer is a noteworthy addition to this release, as it helps to ensure that NVIDIA RTX ray tracing applications are efficiently traversing bounding volumes in your scene. This has direct implications for performance, making it an important feature for anyone looking to optimize their ray tracing applications.