Categories
Misc

Managing Data Centers Securely and Intelligently with NVIDIA UFM Cyber-AI

The NVIDIA UFM Cyber-AI platform helps to minimize downtime in InfiniBand data centers by harnessing AI-powered analytics to detect security threats and operational issues, as well as predict network failures. This post outlines the advanced features that system administrators can use to quickly detect and respond to potential security threats and upcoming failures, saving costs and ensuring consistent customer service.

Today’s data centers host many users and a wide variety of applications. They have even become a key element of competitive advantage for research, technology, and global industries. As the complexity of scientific computing increases, data center operational costs also continue to rise. Beyond the operational disruption caused by security threats, keeping a data center intact and running smoothly is critical.

What’s more, malicious users may exploit data center access to misuse compute resources by running prohibited applications, resulting in unexpected downtime and higher operating costs. More than ever, data center management tools that quickly identify issues while improving efficiency are a priority for today’s IT managers and the developers who support them.

NVIDIA may be best known for stunning graphics capabilities and unmatched GPU compute performance used in nearly every area of research. However, for many years, it has also been the leader in secure and scalable data center technologies, including flexible libraries and tools to maximize world-class infrastructures.

NVIDIA recognizes that providing a full-stack solution for what might be the most critical component of today’s research and business includes more than world-class server platforms, GPUs, and the broadest software portfolio deployed throughout the data center. NVIDIA also knows that security and manageability are key pillars on which data center infrastructure is built.

NVIDIA UFM Cyber-AI revolutionizes the InfiniBand data center

The NVIDIA Unified Fabric Manager (UFM) Cyber-AI platform offers enhanced and real-time network telemetry, combined with AI-powered intelligence and advanced analytics. It enables IT managers to discover operational anomalies and even predict network failures. This improves both security and data center uptime while decreasing overall operating expenses.

The unique advantage of UFM Cyber-AI is its ability to capture rich telemetry information and employ AI techniques to identify hidden correlations between events. This enables it to detect abnormal system and application behavior, and even identify performance degradations before they lead to component or system failure. UFM Cyber-AI can even take corrective actions in real time. The platform learns the typical operational modes of the data center and detects abnormal use based on network telemetry data, including traffic patterns, temperature, and more.

Fundamentals of UFM Cyber-AI

UFM Cyber-AI contains three different layers, as shown in Figure 1.

UFM Cyber-AI contains three layers: Input telemetry, processing models, and output dashboard.
Figure 1. UFM Cyber-AI layers
  • Input telemetry: Collects information and learns from the network in various ways:
    • Telemetry of all elements in the network
    • Network topology (connectivity and resource allocation for tenants or applications)
    • Features and capabilities of network equipment
  • Processing models: Contains several models, such as an extraction, transformation, and loading (ETL) processing engine for data preparation. It also contains aggregation, data storage, and analytical models for comparison. UFM Cyber-AI uses machine learning (ML) techniques and AI models for anomaly detection and prediction to learn the lifecycle patterns of data center network components (cable, switch, port, InfiniBand adapter).
  • Output dashboard: A visualization layer that exposes a central dashboard for network administrators and cloud orchestrators to see alerts and recommendations for improving network utilization and efficiency and for solving network health issues. The dashboard offers two main categories: Suspicious Behavior and Link Analysis, each including sections for alerts and predictions (Figure 2).
Dashboard shows the Suspicious Behavior and Link Analysis categories, with example alerts and predictions.
Figure 2. UFM Cyber-AI Prediction dashboard

A feature-rich, intuitive, and customizable fabric manager

UFM Cyber-AI also supports customizable network alerts and lets you view triggered anomalies over time and across different time dimensions. By using aggregated network statistics based on hour or day-of-the-week parameters, you can set thresholds and configure notifications based on measurements that deviate from typical operational use. For example, you could use predefined thresholds to identify problematic cables.

Built-in analytics compare current telemetry information against time-based aggregated information to detect any suspicious increase or decrease in usage or traffic patterns and immediately notify the system administrator. UFM Cyber-AI also provides data center tenant or application alerts through link or port telemetry, identifying low-level partition key (PKEY) statistics along with their associated nodes.
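NVIDIA has not published the internals of these analytics, but the core idea of comparing live counters against time-aggregated baselines can be sketched in a few lines of Python. The counter names, data shapes, and three-sigma threshold below are illustrative assumptions, not UFM's API:

```python
from statistics import mean, stdev

def flag_anomalies(current, baseline, n_sigma=3.0):
    """Flag telemetry counters that deviate from a time-based baseline.

    current:  {counter_name: latest_value}
    baseline: {counter_name: [historical values for this hour/day slot]}
    Returns the counters whose latest value lies more than n_sigma
    standard deviations from the historical mean.
    """
    alerts = {}
    for name, value in current.items():
        history = baseline.get(name, [])
        if len(history) < 2:
            continue  # not enough history to form a baseline
        mu, sigma = mean(history), stdev(history)
        if sigma == 0:
            continue  # a flat history gives no usable spread
        if abs(value - mu) > n_sigma * sigma:
            alerts[name] = (value, mu)
    return alerts

# A sudden traffic spike on a port stands out against its hourly history.
baseline = {"port7_rx_bytes": [100, 102, 98, 101, 99]}
print(flag_anomalies({"port7_rx_bytes": 500}, baseline))
```

A production system would of course maintain separate baselines per hour-of-day or day-of-week slot, which is what makes the comparison robust to normal daily traffic cycles.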

Only UFM Cyber-AI offers features like link failure prediction, which supports predictive maintenance. By detecting performance degradation in its early stages, UFM Cyber-AI can predict potential link or port failures. This enables administrators to perform maintenance proactively and eliminate data center downtime.
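UFM Cyber-AI's prediction models are proprietary; purely as an illustration of the idea behind predictive maintenance, a least-squares trend over a link's cumulative error counter can flag a port before it fails. The counter, horizon, and limit below are made-up stand-ins:

```python
def predict_link_failure(error_counts, horizon=24, limit=1000):
    """Estimate whether a cumulative link-error counter will cross
    `limit` within `horizon` future samples, via a least-squares trend.

    error_counts: evenly spaced samples of the counter.
    Returns (flag, samples_until_limit); samples_until_limit is None
    when no upward trend is present.
    """
    n = len(error_counts)
    if n < 2:
        return (False, None)
    x_mean = (n - 1) / 2
    y_mean = sum(error_counts) / n
    denom = sum((x - x_mean) ** 2 for x in range(n))
    slope = sum((x - x_mean) * (y - y_mean)
                for x, y in enumerate(error_counts)) / denom
    if slope <= 0:
        return (False, None)  # counter is flat or improving
    remaining = (limit - error_counts[-1]) / slope
    return (remaining <= horizon, remaining)

# A steadily climbing error counter triggers the maintenance flag.
print(predict_link_failure([0, 50, 110, 160, 210]))
```

Real degradation signals (bit-error rates, symbol errors, retransmissions) are noisier than this, which is presumably why UFM applies ML models rather than a single linear fit.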

Future enhancements with NVIDIA Morpheus

Delivering the most robust fabric management solution for InfiniBand requires constant innovation to keep pace with the complexity of managing today’s data centers. We plan to integrate NVIDIA Morpheus with UFM Cyber-AI (Figure 3), bringing in more telemetry information from other data center elements, such as server- and rack-level component telemetry, or DPU, GPU, and application counters.

We could even provide an additional layer that can interface directly with other APIs such as Kafka, an open-source distributed event streaming platform used for high-performance data pipelines, streaming analytics, and data integration. You could use that integration for specific detection of developer-defined operational system exceptions, such as crypto-mining detection on a system dedicated for life-science research.
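As a toy example of what such a developer-defined detector might look like, the function below scans a stream of telemetry events for a sustained high-utilization signature. In a real deployment the events would arrive from a Kafka topic via a consumer client; the field names and thresholds here are invented for illustration:

```python
def detect_sustained_load(events, util_threshold=95, min_run=6):
    """Flag nodes whose GPU utilization stays at or above util_threshold
    for at least min_run consecutive events -- a crude stand-in for a
    crypto-mining signature on a system dedicated to other workloads.

    events: iterable of dicts like {"node": "n1", "gpu_util": 97}
    Returns the set of offending node names.
    """
    runs, offenders = {}, set()
    for ev in events:
        node = ev["node"]
        if ev["gpu_util"] >= util_threshold:
            runs[node] = runs.get(node, 0) + 1
            if runs[node] >= min_run:
                offenders.add(node)
        else:
            runs[node] = 0  # any dip in utilization resets the streak
    return offenders

# Six consecutive pegged readings flag n1; n2 dips and is never flagged.
stream = ([{"node": "n1", "gpu_util": 97} for _ in range(6)]
          + [{"node": "n2", "gpu_util": 97}, {"node": "n2", "gpu_util": 40}])
print(detect_sustained_load(stream))
```

The value of the Kafka interface is exactly this kind of separation: the platform streams the telemetry, and the developer supplies whatever site-specific policy logic makes sense.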

Diagram shows how UFM Cyber-AI integrates with Morpheus to provide enhanced network traffic visibility for improved security.
Figure 3. Integration example of UFM Cyber-AI with the Morpheus framework

Morpheus is an open AI application framework that provides cybersecurity developers with a highly optimized AI pipeline and pretrained AI capabilities. These capabilities enable you to inspect all network traffic instantaneously across your data center fabric. Morpheus brings a new level of security to data centers by providing the following:

  • Dynamic protection
  • Real-time telemetry
  • Adaptive policies
  • Cyber defenses for detecting and remediating cybersecurity threats
Diagram shows potential interfaces to standardized APIs, such as Kafka, RAPIDS, and PyTorch.
Figure 4. Example for UFM Cyber-AI as a flexible and extendable platform

As Morpheus integrates into the UFM Cyber-AI appliance, we can offer the most complete solution, one that is also flexible and extendable, for mission-critical data centers and the developers who support them. With customizable anomaly detection and interfaces to other standardized APIs, UFM Cyber-AI is a flexible asset for any data center or cloud-native infrastructure supporting multitenancy.

For more information, see NVIDIA Unified Fabric Manager.

Categories
Misc

I have a trained model. How can I save it as a .pb file so it can be used on other platforms, and what does this have to do with freezing the graph?

submitted by /u/Shaker007

Categories
Misc

Unexpected performance from RTX 3070Ti, possibly algo limiter?

I have a GTX 1080 and an RTX 3070 Ti, and for some reason they show about the same performance on my model, about 60 seconds per epoch of 3k samples.

The model is 27 layers (as per summary), mostly conv1d, has about 270k parameters and input sample is about 450 values.

I wonder why this is so? The GTX is a much older chip and the RTX is supposed to have faster tensor cores; the 1080 has 2560 cores vs. 6144 cores, and the 3070 Ti also has faster memory. It’s supposed to be at least 2 times faster, even before accounting for core efficiency.

Is this some kind of hardware limiter like the one against mining?

I am using TF 2.5 and latest cudnn/cuda 11/studio drivers.

Also I wonder if 3080Ti/3090 would show similar performance since they are not A models.

submitted by /u/dmter

Categories
Misc

Different predictions for the same image

So I have been working on a project where the model was trained on screenshots taken on a laptop. When images captured on a phone are fed to the model for testing, the predictions are wrong, but when the same images are taken as screenshots [on a laptop] and fed to the model, the predictions are correct. What is the issue here?

submitted by /u/fad-maggot

Categories
Misc

Reducing the size of a TensorFlow model file

I have a TensorFlow model which basically classifies stuff into X-rays and Not X-rays.

My problem is that the TensorFlow model file (for that model) is a whopping 123 MB, and I need to reduce its size somehow.

I saw an answer on Stack Overflow which had the exact code for what I needed to do, but only in TensorFlow 1.0.

So if anyone here could give me the updated TensorFlow 2.0 code, I would appreciate it a lot.

Python version: Python 3.8.7
TensorFlow on my laptop: TensorFlow 2.4
TensorFlow on the server I am hosting the app on: TensorFlow-CPU 2.2.0

submitted by /u/banana_who_can_type

Categories
Misc

Detect hips API (Python)

I’m a photographer and I have to crop a lot of pictures, so I thought I’d ask you guys. I need to identify the hips in order to crop the images; tops, in this situation. I have a bit of knowledge in Python. Is there an API, or do I have to install everything? Thank you.

submitted by /u/one_of_us31

Categories
Misc

TensorFlow Introduces ‘PluggableDevice’ Architecture To Integrate Accelerators (GPUs, TPUs) With TensorFlow Without Making Any Changes In The TensorFlow Code

TensorFlow introduces the PluggableDevice architecture, which seamlessly integrates accelerators (GPUs, TPUs) with TensorFlow without making any changes in the TensorFlow code. PluggableDevice, as the name suggests, provides plug-in options for registering the devices with TensorFlow. It is constructed using the StreamExecutor C API and builds on the work done for Modular TensorFlow. In TF 2.5, the PluggableDevice feature is available.

Full Story: https://www.marktechpost.com/2021/06/24/tensorflow-introduces-pluggabledevice-architecture-to-integrate-accelerators-gpus-tpus-with-tensorflow-without-making-any-changes-in-the-tensorflow-code/

submitted by /u/ai-lover

Categories
Offsites

Achieving Precision in Quantum Material Simulations

In fall of 2019, we demonstrated that the Sycamore quantum processor could outperform the most powerful classical computers when applied to a tailor-made problem. The next challenge is to extend this result to solve practical problems in materials science, chemistry and physics. But going beyond the capabilities of classical computers for these problems is challenging and will require new insights to achieve state-of-the-art accuracy. Generally, the difficulty in performing quantum simulations of such physical problems is rooted in the wave nature of quantum particles, where deviations in the initial setup, interference from the environment, or small errors in the calculations can lead to large deviations in the computational result.

In two upcoming publications, we outline a blueprint for achieving record levels of precision for the task of simulating quantum materials. In the first work, we consider one-dimensional systems, like thin wires, and demonstrate how to accurately compute electronic properties, such as current and conductance. In the second work, we show how to map the Fermi-Hubbard model, which describes interacting electrons, to a quantum processor in order to simulate important physical properties. These works take a significant step towards realizing our long-term goal of simulating more complex systems with practical applications, like batteries and pharmaceuticals.

A bottom view of one of the quantum dilution refrigerators during maintenance. During the operation, the microwave wires that are floating in this image are connected to the quantum processor, e.g., the Sycamore chip, bringing the temperature of the lowest stage to a few tens of milli-degrees above absolute zero temperature.

Computing Electronic Properties of Quantum Materials
In “Accurately computing electronic properties of a quantum ring”, to be published in Nature, we show how to reconstruct key electronic properties of quantum materials. The focus of this work is on one-dimensional conductors, which we simulate by forming a loop out of 18 qubits on the Sycamore processor in order to mimic a very narrow wire. We illustrate the underlying physics through a series of simple text-book experiments, starting with a computation of the “band-structure” of this wire, which describes the relationship between the energy and momentum of electrons in the metal. Understanding such structure is a key step in computing electronic properties such as current and conductance. Despite being an 18-qubit algorithm consisting of over 1,400 logical operations, a significant computational task for near-term devices, we are able to achieve a total error as low as 1%.

The key insight enabling this level of accuracy stems from robust properties of the Fourier transform. The quantum signal that we measure oscillates in time with a small number of frequencies. Taking a Fourier transform of this signal reveals peaks at the oscillation frequencies (in this case, the energy of electrons in the wire). While experimental imperfections affect the height of the observed peaks (corresponding to the strength of the oscillation), the center frequencies are robust to these errors. On the other hand, the center frequencies are especially sensitive to the physical properties of the wire that we hope to study (e.g., revealing small disorders in the local electric field felt by the electrons). The essence of our work is that studying quantum signals in the Fourier domain enables robust protection against experimental errors while providing a sensitive probe of the underlying quantum system.
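This robustness is easy to demonstrate with a toy discrete Fourier transform (a stand-in for the idea, not the actual experimental analysis): damping the oscillation and adding noise changes the peak's height, but not which frequency bin it sits in.

```python
import cmath
import math
import random

def dft_peak_bin(signal):
    """Return the nonzero frequency bin with the largest DFT magnitude."""
    n = len(signal)
    def mag(k):
        return abs(sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                       for t in range(n)))
    return max(range(1, n // 2), key=mag)

n, true_bin = 64, 5
clean = [math.cos(2 * math.pi * true_bin * t / n) for t in range(n)]
random.seed(0)
noisy = [0.4 * s + 0.05 * random.gauss(0, 1) for s in clean]  # weaker + noisy

# The peak height drops, but the center frequency (bin 5) is unchanged.
print(dft_peak_bin(clean), dft_peak_bin(noisy))
```

In the experiment, the analogous quantity is the oscillation frequency of the measured quantum signal, which encodes the electron energies even when gate errors shrink the oscillation amplitude.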

(Left) Schematic of the 54-qubit quantum processor, Sycamore. Qubits are shown as gray crosses and tunable couplers as blue squares. Eighteen of the qubits are isolated to form a ring. (Middle) Fourier transform of the measured quantum signal. Peaks in the Fourier spectrum correspond to the energy of electrons in the ring. Each peak can be associated with a traveling wave that has fixed momentum. (Right) The center frequency of each peak (corresponding to the energy of electrons in the wire) is plotted versus the peak index (corresponding to the momentum). The measured relationship between energy and momentum is referred to as the ‘band structure’ of the quantum wire and provides valuable information about electronic properties of the material, such as current and conductance.

Quantum Simulation of the Fermi-Hubbard Model
In “Observation of separated dynamics of charge and spin in the Fermi-Hubbard model”, we focus on the dynamics of interacting electrons. Interactions between particles give rise to novel phenomena such as high-temperature superconductivity and spin-charge separation. The simplest model that captures this behavior is known as the Fermi-Hubbard model. In materials such as metals, the atomic nuclei form a crystalline lattice and electrons hop from lattice site to lattice site carrying electrical current. In order to accurately model these systems, it is necessary to include the repulsion that electrons feel when getting close to one another. The Fermi-Hubbard model captures this physics with two simple parameters that describe the hopping rate (J) and the repulsion strength (U).

We realize the dynamics of this model by mapping the two physical parameters to logical operations on the qubits of the processor. Using these operations, we simulate a state of the electrons where both the electron charge and spin densities are peaked near the center of the qubit array. As the system evolves, the charge and spin densities spread at different rates due to the strong correlations between electrons. Our results provide an intuitive picture of interacting electrons and serve as a benchmark for simulating quantum materials with superconducting qubits.

(Left top) Illustration of the one-dimensional Fermi-Hubbard model in a periodic potential. Electrons are shown in blue, with their spin indicated by the connected arrow. J, the distance between troughs in the electric potential field, reflects the “hopping” rate, i.e., the rate at which electrons transition from one trough in the potential to another, and U, the amplitude, represents the strength of repulsion between electrons. (Left bottom) The simulation of the model on a qubit ladder, where each qubit (square) represents a fermionic state with spin-up or spin-down (arrows). (Right) Time evolution of the model reveals separated spreading rates of charge and spin. Points and solid lines represent experimental and numerical exact results, respectively. At t = 0, the charge and spin densities are peaked at the middle sites. At later times, the charge density spreads and reaches the boundaries faster than the spin density.

Conclusion
Quantum processors hold the promise to solve computationally hard tasks beyond the capability of classical approaches. However, in order for these engineered platforms to be considered as serious contenders, they must offer computational accuracy beyond the current state-of-the-art classical methods. In our first experiment, we demonstrate an unprecedented level of accuracy in simulating simple materials, and in our second experiment, we show how to embed realistic models of interacting electrons into a quantum processor. It is our hope that these experimental results help progress the goal of moving beyond the classical computing horizon.

Categories
Misc

Buckle Up for the Industrial HPC Revolution

“A confluence of advances has put us at the beginnings of the industrial HPC revolution,” said Jensen Huang. In a short talk viewable below, NVIDIA’s CEO described to a gathering of high performance computing specialists in Europe the genesis and outlook for the most powerful technology trend of our lifetimes. High performance computing is experiencing Read article >

The post Buckle Up for the Industrial HPC Revolution appeared first on The Official NVIDIA Blog.

Categories
Misc

Innovation, Inclusion, Impact: Highlights from Our Annual Corporate Social Responsibility Report

NVIDIA’s 12th annual corporate social responsibility report, published today, shares our progress — and plans — to take care of employees, reach new sustainability goals and channel our tech to support the global community. Following a year of global hardship — the ongoing battle against the COVID-19 pandemic, social unrest and renewed calls for racial Read article >

The post Innovation, Inclusion, Impact: Highlights from Our Annual Corporate Social Responsibility Report appeared first on The Official NVIDIA Blog.