Categories
Misc

Bringing Networking into View with the NVIDIA Air Marketplace

NVIDIA Air now includes the NVIDIA Air Marketplace—a collection of demos to get started building your network digital twin.

Networking simulations are essential because the classical deployment model, based on CLI and error-prone copy/paste configuration, has become inefficient for medium- and large-scale environments. NVIDIA Air provides a platform to build, simulate, and experience a modern data center powered by a modern network operating system (NOS).

What is NVIDIA Air?

NVIDIA Air is a cloud-based environment that runs in your browser and is powered on the backend by NVIDIA Cumulus Linux, SONiC, and Linux (that is, a standard server Linux). This approach to networking simulation reflects the paradigm shift from traditional networking to the cloud-native era.

Air is designed to remove the need for the hypervisor, which is frequently a resource bottleneck and a time-consuming constraint for fast feature testing. Air addresses many scenarios:

  • Demo infrastructure (SONiC in the Cloud, Cumulus in the Cloud, Cumulus and SONiC in the Cloud) 
  • Continuous integration 
  • Custom topologies, with the builder 
  • Training and education 
  • Configuration management 

Air provides an always-accessible, always-on training or preproduction environment for networking teams. Enterprises can now shrink their hardware footprint and decrease expenses: lower CapEx due to reduced hardware needs, and lower OpEx thanks to the Air public cloud operational model. With Air, modern cloud-scale networking has never been easier or more powerful.

NVIDIA Air Marketplace

Recently, we launched the NVIDIA Air Marketplace—a collection of on-demand training, test resources, and demos for Cumulus Linux and other NVIDIA Networking offerings. This collection enhances the simplicity of Air and lowers its barrier to entry. The marketplace consists of content created directly by NVIDIA and by one of the best communities ever: you!

A display of all the currently available demos in the NVIDIA Air Marketplace.
Figure 1. NVIDIA Air Demo Marketplace

How to get started

The marketplace is here to help those who are curious about or new to Cumulus Linux. The curated demo environments make it easy to test new functionality in a straightforward way. You can access a complete demo lab that has the same characteristics as a physical environment. Each demo lab also includes a validated demo guide to help you through the lab.

First, access the air.nvidia.com portal with your username and password, or create a new account. After you have entered the platform, choose Demo Marketplace on the left sidebar. From here, you can view a catalog of prebuilt scenarios that let you create a lab for the specific feature or configuration you would like to test.

Choose the scenario that piques your interest. From here, you can read the README, explore the git repository, or start the demo with a single click.

A screenshot showing the popup README.
Figure 3.  Starting a demo

Air then allocates the resources required. Thanks to the low Cumulus footprint of 768 MB, it takes roughly 90 seconds to spin up 15+ nodes.

When the lab is loaded, you can log in to the mgmt-server from your browser or with your favorite SSH client. 

A screenshot of the loaded demo where you can navigate the command line through Air or with your favorite SSH client.
Figure 5. Guided tours

For example, from iTerm2:

> ssh -p 16732 cumulus@worker01.air.nvidia.com
Welcome to Ubuntu 18.04.5 LTS (GNU/Linux 4.15.0-151-generic x86_64)

* Documentation: https://help.ubuntu.com
* Management:    https://landscape.canonical.com
* Support:       https://ubuntu.com/advantage

System information as of Tue Oct 19 13:03:58 UTC 2021

System load:  0.07             Processes:           114
Usage of /:   29.2% of 9.29GB  Users logged in:     0
Memory usage: 23%              IP address for eth0: 169.254.0.2
Swap usage:   0%               IP address for eth1: 192.168.200.1

25 updates can be applied immediately.
16 of these updates are standard security updates.
To see these additional updates run: apt list --upgradable

New release '20.04.3 LTS' available.
Run 'do-release-upgrade' to upgrade to it.

Last login: Tue Oct 19 13:03:46 2021 from fd01:1:1:32c5::1
cumulus@oob-mgmt-server:~$ 

Create your own demo

Do you have an idea for a demo for a specific use case? This is the perfect time to become an active part of the community. You can create your own demo environment and submit it for review by contacting your NVIDIA sales representative or an NVIDIA team member. The Air team will review and publish the demos to the marketplace.

With the vibrant NVIDIA Air community, the sky’s the limit for training and collaboration in the marketplace!  

Categories
Misc

‘AI 2041: Ten Visions for Our Future’: AI Pioneer Kai-Fu Lee Discusses His New Work of Fiction

One of AI’s greatest champions has turned to fiction to answer the question: how will technology shape our world in the next 20 years? Kai-Fu Lee, CEO of Sinovation Ventures and a former president of Google China, spoke with NVIDIA AI Podcast host Noah Kravitz about AI 2041: Ten Visions for Our Future. The book, Read article >

The post ‘AI 2041: Ten Visions for Our Future’: AI Pioneer Kai-Fu Lee Discusses His New Work of Fiction appeared first on The Official NVIDIA Blog.

Categories
Misc

NVIDIA DPU Hackathon Unveils AI, Cloud, and Accelerated Computing Breakthroughs

Two hackathon participants typing on their computer.
NVIDIA announces the winners from the second global DPU Hackathon.

The second global NVIDIA DPU Hackathon brought together 11 teams with the goal of creating new and exciting data processing unit (DPU) innovations. Spanning 24 hours from December 8 to 9, the second in a series of global NVIDIA DPU Hackathons received over 50 team applications from various universities and enterprises. 

As a new class of programmable processors, a DPU ignites unprecedented innovation for modern data centers. By offloading, accelerating, and isolating a broad range of advanced networking, storage, and security services, NVIDIA BlueField DPUs provide a secure and accelerated infrastructure for any workload in any environment. The NVIDIA DOCA software framework brings together APIs, drivers, libraries, sample code, documentation, services, and prepackaged containers so developers can speed application development and deployment on BlueField DPUs. These components span several use cases, including security, automation, AI, HPC, and telemetry.

“We love hackathons; they create the right environment to perform a step function in development. We put the DOCA developers in the center, offering them training, mentorship, preconfigured setups, documentation, a working environment, and visibility. Moving forward, hackathons will play a significant role in establishing a strong DOCA developer community,” said Dror Goldenberg, the SVP of Software Architecture at NVIDIA.

Two of the hackathon participants collaborating on their team application.
Figure 1. Hackathon participants sitting at their laptops.

DPU Hackathon winners

First Place – Team Rutgers University

Team Rutgers University focused on developing a unique, high-performance, DPU-accelerated, and scalable L4 load balancer. Using DOCA FLOW APIs to configure the embedded switch, Team Rutgers built an application that delivers hardware acceleration for offloading the load-balancing algorithm and handles flow tracking in hardware. The final design is a testament to the unique value that can be achieved with DOCA.

Second Place – Team Equinix Metal

Team Equinix Metal focused on innovative DPU service orchestration with gRPC APIs. They were clearly excited to try the DPU in a bare-metal cloud use case to improve their existing synchronous method. By using gRPC to configure the bare-metal host network asynchronously, they ensured networking commands were handled even if the network was disrupted. This enabled them to send gRPC commands to the DPU for asynchronous configuration of OVS running on the BlueField and deliver BlueField service orchestration.

Third Place – Team BlueJazz from Versa Networks

Team BlueJazz created a DPU-accelerated, secure access service by running traffic inspection and inference services on the DPU with 100G links. With their innovation, they offload any datapath with subfunctions and accelerate processing by virtualizing the DPU as a security engine. Team BlueJazz used DOCA Deep Packet Inspection APIs to offload the pattern-matching logic and leveraged DOCA reference applications for URL filtering and application recognition.

Congratulations to our winners and thank you to all of the teams that participated, making this round of our NVIDIA DPU Hackathon a success!

Join the DOCA community

NVIDIA is building a broad community of DOCA developers to create innovative applications and services on top of BlueField DPUs to secure and accelerate modern, efficient data centers. To learn more about joining the community, visit the DOCA developer web page or register to download DOCA today.

Up next is the NVIDIA DPU Hackathon in China. Check the corporate calendar to stay informed for future events, and take part in our journey to reshape the data center of tomorrow. 

Resources

Categories
Misc

Are there any differences between Mediapipe and MoveNet?

Hi everyone,

I’m a little confused after going through the documentation for these two as I don’t understand the difference between these two libraries. They both even use 17 points to find the person’s position. Is there any difference between these two libraries?

I’m going to be using this in a mobile app to take a picture when a user hits a specific pose, and as you can tell, I’m new to all of this.

Thank you for any help and guidance in advance.

submitted by /u/A_Tired_Founder
[visit reddit] [comments]

Categories
Misc

Advent of Code 2021 in pure TensorFlow – day 3. TensorArrays limitations, and tf.function relaxed shapes

submitted by /u/pgaleone
[visit reddit] [comments]

Categories
Misc

Scaling Zero Touch RoCE Technology with Round Trip Time Congestion Control

Zero Touch RoCE enables a smooth data highway.
The new NVIDIA RTTCC congestion control algorithm for ZTR delivers RoCE performance at scale, without special switch infrastructure configuration.

NVIDIA Zero Touch RoCE (ZTR) enables data centers to seamlessly deploy RDMA over Converged Ethernet (RoCE) without requiring any special switch configuration. Until recently, ZTR was optimal for only small to medium-sized data centers. Meanwhile, large-scale deployments have traditionally relied on Explicit Congestion Notification (ECN) to enable RoCE network transport, which requires switch configuration.

The new NVIDIA congestion control algorithm—Round-Trip Time Congestion Control (RTTCC)—allows ZTR to scale to thousands of servers without compromising performance. Using ZTR and RTTCC allows data center operators to enjoy ease of deployment and operations together with the superb performance of Remote Direct Memory Access (RDMA) at massive scale, without any switch configuration.

This post describes the previously recommended RoCE congestion control in large and small-scale RoCE deployments. It then introduces a new congestion control algorithm that allows configuration-free, large-scale implementations of ZTR, which perform like ECN-enabled RoCE. 

RoCE deployments with Data Center Quantized Congestion Notification

In a typical TCP-based environment, distributed memory requests require many steps and CPU cycles, negatively impacting application performance. RDMA eliminates CPU involvement in memory data transfers between servers, significantly accelerating both access to stored data and application performance.

RoCE provides RDMA in Ethernet environments—the primary network fabric in data centers. Ethernet requires an advanced congestion control mechanism to support RDMA network transports. Data Center Quantized Congestion Notification (DCQCN) is a congestion control algorithm that enables responding to congestion notifications and dynamically adjusting traffic transmit rates. 

The implementation of DCQCN requires enabling Explicit Congestion Notification (ECN), which entails configuring network switches. ECN configures switches to set the Congestion Experienced (CE) bit to indicate the imminent onset of congestion. 
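To make the feedback loop concrete, here is a minimal, purely illustrative sketch of how a DCQCN-style sender reacts to ECN feedback. The class name, constants, and update rules are hypothetical simplifications for illustration, not the actual DCQCN specification or any NVIDIA implementation.

```python
# Illustrative sketch of DCQCN-style rate control (hypothetical names and
# constants; not the actual DCQCN specification or NVIDIA firmware logic).
class DcqcnSender:
    def __init__(self, line_rate_gbps):
        self.rate = line_rate_gbps    # current transmit rate
        self.target = line_rate_gbps  # rate to recover toward
        self.alpha = 1.0              # estimated congestion level

    def on_cnp(self):
        """A congestion notification arrived: the receiver saw CE-marked packets."""
        self.target = self.rate                  # remember the pre-cut rate
        self.rate *= (1 - self.alpha / 2)        # multiplicative decrease
        self.alpha = min(1.0, self.alpha + 0.5)  # congestion estimate grows

    def on_quiet_period(self):
        """No notifications for a timer interval: decay estimate, recover rate."""
        self.alpha *= 0.5                          # congestion estimate decays
        self.rate = (self.rate + self.target) / 2  # recover toward the target
```

The key division of labor matches the description above: the switch only sets the CE bit, and all rate adaptation happens at the sender.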

Zero touch RoCE—with reactive congestion control 

The NVIDIA-developed ZTR technology allows RoCE deployments that don't require configuring the switch infrastructure. Built according to the InfiniBand Trade Association (IBTA) RDMA standard and fully compliant with the RoCE specifications, ZTR enables seamless deployment of RoCE. ZTR also delivers performance equivalent to traditional switch-configured RoCE and significantly better than traditional TCP-based memory access. Moreover, with ZTR, RoCE network transport services operate side by side with non-RoCE communications in ordinary TCP/IP environments.

As noted in the NVIDIA Zero-Touch RoCE Technology Enables Cloud Economics for Microsoft Azure Stack HCI post, Microsoft has validated ZTR for their Azure Stack HCI platform, which typically scales to a few dozen nodes. In such environments, ZTR relies on implicit packet-loss notification, which is sufficient for small-scale deployments. With the addition of a new round-trip time (RTT)-based congestion control algorithm, ZTR becomes even more robust and scalable, without relying on packet loss to notify the server of network congestion.

Introducing round-trip time congestion control

The new NVIDIA congestion control algorithm, RTTCC, actively monitors network RTT to proactively detect and adapt to the onset of congestion before dropping packets. RTTCC enables dynamic congestion control using a hardware-based feedback loop that provides dramatically superior performance compared to software-based congestion control algorithms. RTTCC also supports faster transmission rates and can deploy ZTR at a larger scale. ZTR with RTTCC is now available as a beta feature, with GA planned for the second half of 2022.

How ZTR-RTTCC works

ZTR-RTTCC extends DCQCN in RoCE networks with a hardware RTT-based congestion control algorithm.

Server A (the initiator) sends both payload and timing packets to server B. Timing packets are immediately returned to the initiator, enabling it to measure the round-trip latency.
Figure 1. Round trip timing between servers

Timing packets (green network packets in the preceding figure) are periodically sent from the initiator to the target. The timing packets are immediately returned, enabling measurement of round-trip latency. RTTCC measures the interval between when the packet was sent and when the initiator received it. The difference (Time Received – Time Sent) measures round-trip latency, which indicates path congestion. Uncongested flows continue to transmit packets to make the best use of the available network path bandwidth. Flows showing increasing latency imply path congestion, in which case RTTCC throttles traffic to avoid buffer overflow and packet drops.

Network traffic can be adjusted up or down in real time as congestion decreases or increases. The ability to actively monitor and react to congestion is what enables ZTR to manage congestion proactively. This proactive rate control also reduces packet retransmission and improves RoCE performance. With ZTR-RTTCC, data center nodes do not wait to be notified of packet loss; instead, they actively identify congestion before packet loss occurs and react accordingly, notifying initiators to adjust transmission rates.
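The latency-driven feedback loop described above can be sketched in a few lines. Everything here (the class name, the 1.5x threshold, the back-off and probe step sizes) is hypothetical and only illustrates the idea of throttling on rising RTT; it is not the actual RTTCC hardware algorithm.

```python
# Illustrative RTT-based rate controller (hypothetical thresholds and step
# sizes; not the actual NVIDIA RTTCC hardware algorithm).
class RttRateController:
    def __init__(self, baseline_rtt_us, max_rate_gbps):
        self.baseline = baseline_rtt_us  # RTT of the uncongested path
        self.max_rate = max_rate_gbps
        self.rate = max_rate_gbps        # current transmit rate

    def on_timing_packet(self, sent_us, received_us):
        """Process a returned timing packet and adjust the transmit rate."""
        rtt = received_us - sent_us          # Time Received - Time Sent
        if rtt > 1.5 * self.baseline:        # rising RTT implies queue buildup
            self.rate = max(self.rate * 0.8, 1.0)  # throttle before drops occur
        else:
            self.rate = min(self.rate + 1.0, self.max_rate)  # probe upward
        return self.rate
```

Note that, as with ZTR-RTTCC, congestion is inferred from latency alone: no switch configuration and no dropped packets are needed to trigger the back-off.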

As noted earlier, one of the key benefits of ZTR is the ability to provide RoCE functionality while operating simultaneously with non-RoCE communications in ordinary TCP/IP traffic. ZTR provides seamless deployment of RoCE network capabilities. With the addition of RTTCC actively monitoring congestion, ZTR provides data center-wide operation without switch configuration. Read on to see how it performs.

ZTR with RTTCC performance

As shown in Figure 2, ZTR with RTTCC provides application performance comparable to RoCE when ECN and PFC are configured across the network fabric. These tests were performed under worst-case many-to-one (incast) scenarios to simulate throughput under congested conditions.

The results indicate that not only does ZTR with RTTCC scale to thousands of nodes, but it also performs comparably to the fastest RoCE solution currently available.

  • At small scale (256 connections and below), ZTR with RTTCC performs within 99% of RoCE with ECN congestion control enabled (conventional RoCE).
  • With over 16,000 connections, ZTR with RTTCC throughput is 98% of conventional RoCE throughput.

ZTR with RTTCC provides near-equivalent performance to conventional RoCE without requiring any switch configuration.

A diagram showing comparison of network throughput (Gb/s) for ZTR w/ RTTCC and RoCE w/ DC-QCN (Conventional RoCE)
Figure 2. Application bandwidth with increasing connections

Configuring ZTR

To configure ZTR with the new RTTCC algorithm, download and install the latest firmware and tools for your NVIDIA network interface card and perform the following steps.

Enable programmable congestion control using mlxconfig (persistent configuration):

mlxconfig -d /dev/mst/mt4125_pciconf0 -y s ROCE_CC_LEGACY_DCQCN=0

Reset the device using mlxfwreset or reboot the host:

mlxfwreset -d /dev/mst/mt4125_pciconf0 -l 3 -y r

After you complete these steps, ZTR-RTTCC takes effect when RDMA-CM is used with Enhanced Connection Establishment (ECE, supported with MLNX_OFED version 5.1).

If there’s an error, you can force ZTR-RTTCC usage regardless of RDMA-CM synchronization status:

mlxreg -d /dev/mst/mt4125_pciconf0 --reg_id 0x506e --reg_len 0x40 --set "0x0.0:8=2,0x4.0:4=15" -y

Summary

NVIDIA RTTCC, the new congestion control algorithm for ZTR, delivers superb RoCE performance at data center scale, without any special configuration of the switch infrastructure. This enhancement allows data centers to enable RoCE seamlessly in both existing and new data center infrastructure and benefit from immediate application performance improvements. 

We encourage you to test ZTR with RTTCC for your application use cases by downloading the latest NVIDIA software.

Categories
Misc

NVIDIA Awards $50,000 Fellowships to Ph.D. Students for GPU Computing Research

For more than two decades, NVIDIA has supported graduate students doing GPU-based work through the NVIDIA Graduate Fellowship Program. Today we’re announcing the latest awards of up to $50,000 each to 10 Ph.D. students involved in GPU computing research. Selected from a highly competitive applicant pool, the awardees will participate in a summer internship preceding Read article >

The post NVIDIA Awards $50,000 Fellowships to Ph.D. Students for GPU Computing Research appeared first on The Official NVIDIA Blog.

Categories
Misc

What Is a Digital Twin?

A digital twin is a continuously updated virtual representation — a true-to-reality simulation of physics and materials — of a real-world physical asset or system.

The post What Is a Digital Twin? appeared first on The Official NVIDIA Blog.

Categories
Misc

Startup Surge: Utility Feels Power of Computer Vision to Track its Lines

It was the kind of message Connor McCluskey loves to find in his inbox. As a member of the product innovation team at FirstEnergy Corp. — an electric utility serving 6 million customers from central Ohio to the New Jersey coast — his job is to find technologies that open new revenue streams or cut Read article >

The post Startup Surge: Utility Feels Power of Computer Vision to Track its Lines  appeared first on The Official NVIDIA Blog.

Categories
Misc

Why can’t I find `ndim` in the API docs for tf.Tensor?

I’m following a tutorial that uses `ndim`, for example:

scalar = tf.constant(7)
scalar.ndim

However, I can’t find `ndim` in the attribute section of the API docs for `tf.Tensor`

Where should I be looking for this?

submitted by /u/snowch_uk
[visit reddit] [comments]