Categories
Misc

Improve Guidance and Performance Visualization with the New Nsight Compute

Learn more about new features and ways to improve system performance using Nsight Compute 2022.2.

NVIDIA Nsight Compute is an interactive kernel profiler for CUDA applications. It provides detailed performance metrics and API debugging through a user interface and a command-line tool. Nsight Compute 2022.2 includes features to expand the supported environments and workflows for CUDA kernel profiling and optimization. 

Download now. >>

The following outlines the feature highlights of Nsight Compute 2022.2.

NVIDIA OptiX acceleration structure viewer

With the new NVIDIA OptiX acceleration structure viewer, users can inspect the structures they build before launching a ray-tracing pipeline. Acceleration structures describe a rendered scene’s geometries for ray-tracing intersection calculations. Users create these acceleration structures and OptiX translates them to internal data structures. A user-created description can contain errors, and it can be difficult to understand why the rendered result is not as expected or what is limiting performance.

With this new feature, users can navigate through acceleration structures in a 3D visualizer and view the parameters used during their creation, such as build flags, triangle mesh vertices, and AABB coordinates. The viewer helps identify overlaps or inefficient hierarchies that result in subpar ray-tracing performance.

Figure 1. Nsight Compute acceleration structure viewer with 3D scene navigation

Issues detection per kernel

The latest version adds a new “Issues Detected” column to the summary page so users can sort all profiled kernels by the number of performance issues detected. This gives users guidance on where to focus their efforts across multiple results (kernel profiles). If users are unsure which kernel to optimize first, a long-running kernel with a high number of detected issues is a good starting point.

Figure 2. Issues detected column in summary page identifies kernels with the most performance issues

Additional improvements

The metric grouping and selection options on the source page have been improved to make them easier to use. Additionally, this release adds support for running the Nsight Compute user interface on Arm SBSA and L4T-based platforms, so users can profile without needing remote connections or separate host machines for the user interface.

Check out the sessions below released at NVIDIA GTC 2022 showcasing Nsight tool capabilities, support with Jetson Orin, and more.

Nsight Compute Resources


Real-Time AI Model Aims to Help Protect the Great Barrier Reef

Google worked with Australia’s national science agency to train ML models that monitor and map harmful coral-eating crown-of-thorns starfish outbreaks along the Great Barrier Reef.

Marine biologists have a new AI tool for monitoring and protecting coral reefs. The project, a collaboration between Google and Australia’s Commonwealth Scientific and Industrial Research Organisation (CSIRO), employs computer vision detection models to pinpoint damaging outbreaks of crown-of-thorns starfish (COTS) through a live camera feed. Keeping a closer eye on reefs helps scientists address growing populations quickly, to protect the valuable Great Barrier Reef ecosystem.

Despite covering less than 1% of the vast ocean floor, coral reefs support about 25% of sea species including fish, invertebrates, and marine mammals. When healthy, these productive marine environments provide commercial and subsistence fishing and income for tourism and recreational businesses. They also protect coastal communities during storm surges and are a rich source of antiviral compounds for drug discovery research.

Assemblages of COTS are found throughout the Indo-Pacific region and feed on coral polyps—the living part of hard coral reefs. They typically occur in low numbers, posing little harm to the ecosystem. However, as outbreaks increase in frequency—in part due to nutrient run-off and a decline in natural predators—they are causing significant damage.

Healthy reefs take about 10 to 20 years to recover from COTS outbreaks, defined by 30 or more adults per 10,000 square meters, or when densities consume coral faster than it can grow. Degraded reefs facing environmental stressors such as climate change, pollution, and destructive fishing practices are less likely to recover, resulting in irreversible damage, diminished coral cover, and biodiversity loss.

Scientists control outbreaks through interventions. Two common approaches involve injecting a starfish with bile salts or removing populations from the water. However, traditional reef surveying, which consists of towing a snorkeler behind a boat for visual identification, is time-consuming, labor-intensive, and less accurate.

According to the project’s TensorFlow post, “CSIRO developed an edge ML platform (built on top of the NVIDIA Jetson AGX Xavier) that can analyze underwater image sequences and map out detections in near real time.” The authors, Megha Malpani, an AI/ML product manager at Google, and Ard Oerlemans, a Google software engineer, are part of a team of researchers working with CSIRO to build the most accurate and performant models.

Video 1. Learn about how Google teamed up with CSIRO to create an ML model that helps monitor harmful species on the Great Barrier Reef

Employing an annotated dataset from CSIRO, the researchers developed an accurate object detection model that uses a live camera feed rather than a snorkeler to detect the starfish.

It processes images at more than 10 frames per second with precision across a variety of ocean conditions such as lighting, visibility, depth, viewpoint, coral habitat, and the number of COTS present.

According to the post, when a COTS starfish is detected, it is assigned a unique tracking ID that links its detections across video frames over time. “We link detections in subsequent frames to each other by first using optical flow to predict where the starfish will be in the next frame, and then matching detections to predictions based on their Intersection over Union (IoU) score,” Malpani and Oerlemans write.
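The matching step quoted above can be illustrated with a small sketch. This is not the project's code. It assumes boxes are given as (x_min, y_min, x_max, y_max), that the optical-flow predictions are already available as a dictionary keyed by a hypothetical track ID, and that pairs are assigned greedily above an assumed IoU threshold of 0.5. The post does not state which matching strategy or threshold the team uses.

```python
from typing import Dict, List, Tuple

# Assumed box format: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over Union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def match_detections(
    predicted: Dict[int, Box],   # hypothetical: track ID -> box predicted for this frame
    detections: List[Box],       # boxes detected in this frame
    min_iou: float = 0.5,        # assumed threshold, not taken from the post
) -> Dict[int, int]:
    """Greedily map detection index -> track ID, highest IoU first."""
    pairs = sorted(
        (
            (iou(box, det), track_id, det_idx)
            for track_id, box in predicted.items()
            for det_idx, det in enumerate(detections)
        ),
        reverse=True,
    )
    assigned: Dict[int, int] = {}
    used_tracks = set()
    for score, track_id, det_idx in pairs:
        if score < min_iou:
            break  # remaining pairs overlap too little to be the same starfish
        if det_idx in assigned or track_id in used_tracks:
            continue  # each detection and each track is used at most once
        assigned[det_idx] = track_id
        used_tracks.add(track_id)
    return assigned
```

Detections left unmatched would presumably start new tracks, which is how a running count of distinct starfish could be kept. That bookkeeping is omitted here.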

With the ultimate goal of quickly determining the total number of COTS, the team focused on the entire pipeline’s accuracy. The “current 1080p model using TensorFlow TensorRT runs at 11 FPS on the Jetson AGX Xavier, reaching a sequence-based F2 score of 0.80! We additionally trained a 720p model that runs at 22 FPS on the Jetson module, with a sequence-based F2 score of 0.78,” the researchers write. 
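For readers unfamiliar with the F2 metric quoted above, the sketch below shows the standard F-beta formula with beta = 2, which weights recall more heavily than precision. It is only the textbook definition. How the team aggregates precision and recall per sequence is not described here, so the inputs are assumed to be already computed.

```python
def f_beta(precision: float, recall: float, beta: float = 2.0) -> float:
    """Standard F-beta score; beta=2 gives the F2 score."""
    if precision == 0.0 and recall == 0.0:
        return 0.0  # avoid division by zero when nothing was found correctly
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)
```

With beta = 2, a model with precision 0.5 and recall 1.0 scores higher than one with precision 1.0 and recall 0.5, so missed starfish are penalized more than false alarms.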

Figure 1. A rendering of labeled COTS starfish on a reef (credit: Google/CSIRO)

According to the study, the project aims to showcase the capability of machine learning and AI technology applications for large-scale surveillance of ocean habitats. 

Their work is open source and available through the crown-of-thorns starfish detection pipeline on GitHub or in Google Colab. The project is part of Google’s Digital Future Initiative with CSIRO.

Read more. >>


How to load a huge amount of images with tf.data

I’m a newbie. I’ve been working with TensorFlow datasets, so it’s my first time loading huge external data, but there’s a problem: my dataset is too big for my memory capacity.

I tried with .flow_from_directory(), but it seems to organize the data by class, where the classes are the folders inside. That is not the case for my dataset: it’s a train folder -> a lot of folders with random names, and inside those are the images, so .flow_from_directory() reads each random name as the label or class. Is there a way to change that?

I’ve read the tf.data documentation but honestly, I don’t know how to solve my problem yet. I want to load all the data at the same time but it’s too big, so I need help. Please don’t only send me to read the documentation again :(.

submitted by /u/Current_Falcon_3187


[Discussion] Guys, I’m not a robot and I’d like to share a great TFX resource with you <3

Hi guys, my friend recently created a great online course describing how to apply BERT to sentiment analysis using TFX and Vertex AI. We want to reach as many people as possible because we spent several weeks working on it. The course is free, and we’ve also included the entire pipeline and helper library codebase, ready for use as a template in your project.

I’ve tried to share the link with you several times because I think it’s the perfect group (after all, we are showing the possibilities that TensorFlow Extended offers).

Unfortunately, my post gets deleted every time. So, I’m trying again, without the marketing bullshit.

BERT SENTIMENT ANALYSIS ON VERTEX AI USING TFX

Please check out this course and let me know what you think. I hope it will be helpful to some of you. And if you think something is missing, feel free to discuss. We will gladly add the missing parts!

Btw, any ideas on how to add links without being considered spam?

submitted by /u/Novel_Cryptographer6


Fantastical 3D Creatures Roar to Life ‘In the NVIDIA Studio’ With Artist Massimo Righi

The year of the tiger comes into focus this week In the NVIDIA Studio, which welcomes 3D creature artist Massimo Righi. An award-winning 3D artist with two decades of experience in the film industry, Righi has received multiple artist-of-the-month accolades and features in top creative publications.

The post Fantastical 3D Creatures Roar to Life ‘In the NVIDIA Studio’ With Artist Massimo Righi appeared first on NVIDIA Blog.


question about reading tfrecord

I would like to read a “.tfrecord” dataset in my Python code and print some of the rows with all their individual values. What’s the easiest way to do it? I have the tfrecord file and metadata.json.

submitted by /u/samas69420


custom op in tensorflow

Can I write a custom op file in C and compile it directly?

submitted by /u/Section_Disastrous


HPC Researchers Seed the Future of In-Network Computing With NVIDIA BlueField DPUs

Across Europe and the U.S., HPC developers are supercharging supercomputers with the power of Arm cores and accelerators inside NVIDIA BlueField-2 DPUs. At Los Alamos National Laboratory (LANL) that work is one part of a broad, multiyear collaboration with NVIDIA that targets 30x speedups in computational multi-physics applications. LANL researchers foresee significant performance gains using Read article >

The post HPC Researchers Seed the Future of In-Network Computing With NVIDIA BlueField DPUs appeared first on NVIDIA Blog.


Scientists Building Digital Twins in NVIDIA Omniverse to Accelerate Clean Energy Research

As global climate change accelerates, finding and securing clean energy is a crucial challenge for many researchers, organizations and governments. The U.K.’s Atomic Energy Authority (UKAEA), through an evaluation project at the University of Manchester, has been testing the NVIDIA Omniverse simulation platform to accelerate the design and development of a full-scale fusion powerplant that Read article >

The post Scientists Building Digital Twins in NVIDIA Omniverse to Accelerate Clean Energy Research appeared first on NVIDIA Blog.


The Road to the Hybrid Quantum-HPC Data Center Starts Here

It’s time to start building tomorrow’s hybrid quantum computers. The motivation is compelling, the path is clear and key components for the job are available today. Quantum computing has the potential to bust through some of today’s toughest challenges, advancing everything from drug discovery to weather forecasting. In short, quantum computing will play a huge Read article >

The post The Road to the Hybrid Quantum-HPC Data Center Starts Here appeared first on NVIDIA Blog.