Categories
Misc

Achieve up to 75% Performance Improvement for Communication Intensive HPC Applications with NVTAGS

NVTAGS automates intelligent GPU assignment by profiling HPC applications and launching them with a custom GPU assignment tailored to an application and system to minimize communication costs.

Many GPU-accelerated HPC applications spend a substantial portion of their time in non-uniform, GPU-to-GPU communications. Additionally, in many HPC systems, different GPU pairs share communication links with varying bandwidth and latency. As a result, GPU assignment can substantially impact time to solution. Furthermore, on multi-node / multi-socket systems, communication performance can degrade when GPUs communicate with CPUs and NICs outside their system affinity. Because resource selection is system dependent, it is challenging to select resources such that communication costs are minimized.

NVIDIA Topology-Aware GPU Selection (NVTAGS) abstracts away the complexity of efficient resource selection. NVTAGS automates intelligent GPU assignment by profiling HPC applications and launching them with a custom GPU assignment tailored to an application and system to minimize communication costs. NVTAGS ensures that, regardless of a system’s communication topology, MPI processes communicate with the CPUs and NICs or HCAs within their own affinity. 
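
To make this concrete, the sketch below is a hypothetical illustration of topology-aware assignment, not the NVTAGS algorithm or its API: given a profiled communication matrix between MPI ranks and the bandwidths of the links between GPU pairs, choose the rank-to-GPU mapping that minimizes estimated transfer time.

```python
from itertools import permutations

import numpy as np

# Hypothetical sketch (not NVTAGS): pick the rank->GPU mapping that
# minimizes estimated communication time for a profiled workload.

# comm[i][j]: bytes exchanged between MPI ranks i and j (from profiling).
comm = np.array([[0, 8, 1, 1],
                 [8, 0, 1, 1],
                 [1, 1, 0, 8],
                 [1, 1, 8, 0]], dtype=float)

# bw[a][b]: link bandwidth between GPUs a and b (NVLink pairs are faster).
bw = np.array([[1, 50, 10, 10],
               [50, 1, 10, 10],
               [10, 10, 1, 50],
               [10, 10, 50, 1]], dtype=float)

def cost(mapping):
    # Estimated total transfer time: bytes / bandwidth over all rank pairs.
    return sum(comm[i][j] / bw[mapping[i]][mapping[j]]
               for i in range(len(mapping))
               for j in range(len(mapping)) if i != j)

best = min(permutations(range(4)), key=cost)  # brute force is fine at this scale
print("rank -> GPU:", best)
```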

NVTAGS improves the performance of Chroma, MILC, and LAMMPS by 2% to 75% on one to 16 nodes.

Key NVTAGS Features:

  • Automated topology detection along with CPU and NIC/HCA binding, independent of the system and HPC application
  • Support for single- and multi-node, PCIe, and NVIDIA NVLink with NVIDIA Pascal, Volta, and Ampere architecture GPUs
  • Automatic caching of efficient GPU selection for future simulations
  • Straightforward integration with Slurm and Singularity

Download NVTAGS 1.0.0 today. 

Additional Resources:

NVTAGS Product Page
Blog: Overcoming Communication Congestion for HPC Applications with NVIDIA NVTAGS

Categories
Offsites

Improving Genomic Discovery with Machine Learning

Each person’s genome, which collectively encodes the biochemical machinery they are born with, is composed of over 3 billion letters of DNA. However, only a small subset of the genome (~4-5 million positions) varies between two people. Nonetheless, each person’s unique genome interacts with the environment they experience to determine the majority of their health outcomes. A key method of understanding the relationship between genetic variants and traits is a genome-wide association study (GWAS), in which each genetic variant present in a cohort is individually examined for correlation with the trait of interest. GWAS results can be used to identify and prioritize potential therapeutic targets by identifying genes that are strongly associated with a disease of interest, and can also be used to build a polygenic risk score (PRS) to predict disease predisposition based on the combined influence of variants present in an individual. However, while accurate measurement of traits in an individual (called phenotyping) is essential to GWAS, it often requires painstaking expert curation and/or subjective judgment calls.
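
To illustrate the core computation, the toy sketch below (a simplification we assume for illustration; real GWAS pipelines add covariates, corrections for population structure, and stringent multiple-testing thresholds) tests each variant individually for association with a trait:

```python
import numpy as np
from scipy import stats

# Toy GWAS sketch: for each variant, test whether genotype dosage
# (0/1/2 copies of the alternate allele) is associated with the trait.
# Data are simulated; variant 2 is constructed to be causal.
rng = np.random.default_rng(0)
n_people, n_variants = 1000, 5
genotypes = rng.integers(0, 3, size=(n_people, n_variants)).astype(float)
trait = 0.5 * genotypes[:, 2] + rng.normal(size=n_people)

for v in range(n_variants):
    slope, _, _, p_value, _ = stats.linregress(genotypes[:, v], trait)
    print(f"variant {v}: effect={slope:+.3f}  p={p_value:.2e}")
```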

In “Large-scale machine learning-based phenotyping significantly improves genomic discovery for optic nerve head morphology”, we demonstrate how using machine learning (ML) models to classify medical imaging data can be used to improve GWAS. We describe how models can be trained for phenotypes to generate trait predictions and how these predictions are used to identify novel genetic associations. We then show that the novel associations discovered improve PRS accuracy and, using glaucoma as an example, that the improvements for anatomical eye traits relate to human disease. We have released the model training code and detailed documentation for its use on our Genomics Research GitHub repository.

Identifying genetic variants associated with eye anatomical traits
Previous work has demonstrated that ML models can identify eye diseases, skin diseases, and abnormal mammogram results with accuracy approaching or exceeding state-of-the-art methods by domain experts. Because identifying disease is a subset of phenotyping, we reasoned that ML models could be broadly used to improve the speed and quality of phenotyping for GWAS.

To test this, we chose a model that uses a fundus image of the eye to accurately predict whether a patient should be referred for assessment for glaucoma. This model uses the fundus images to predict the diameters of the optic disc (the region where the optic nerve connects to the retina) and the optic cup (a whitish region in the center of the optic disc). The ratio of the diameters of these two anatomical features (called the vertical cup-to-disc ratio, or VCDR) correlates strongly with glaucoma risk.

A representative retinal fundus image showing the vertical cup-to-disc ratio, which is an important diagnostic measurement for glaucoma.

We applied this model to predict VCDR in all fundus images from individuals in the UK Biobank, which is the world’s largest dataset available to researchers worldwide for health-related research in the public interest, containing extensive phenotyping and genetic data for ~500,000 pseudonymized (the UK Biobank’s standard for de-identification) individuals. We then performed GWAS in this dataset to identify genetic variants that are associated with the model-based predictions of VCDR.

Applying a VCDR prediction model trained on clinical data to generate predicted values for VCDR to enable discovery of genetic associations for the VCDR trait.

The ML-based GWAS identified 156 distinct genomic regions associated with VCDR. We compared these results to a VCDR GWAS conducted by another group on the same UK Biobank data, Craig et al. 2020, where experts had painstakingly labeled all images for VCDR. The ML-based GWAS replicates 62 of the 65 associations found in Craig et al., which indicates that the model accurately predicts VCDR in the UK Biobank images. Additionally, the ML-based GWAS discovered 93 novel associations.

Number of statistically significant GWAS associations discovered by exhaustive expert labeling approach (Craig et al., left), and by our ML-based approach (right), with shared associations in the middle.

The ML-based GWAS improves polygenic model predictions
To validate that the novel associations discovered in the ML-based GWAS are biologically relevant, we developed independent PRSes using the Craig et al. and ML-based GWAS results, and tested their ability to predict human-expert-labeled VCDR in a subset of UK Biobank as well as a fully independent cohort (EPIC-Norfolk). The PRS developed from the ML-based GWAS showed greater predictive ability than the PRS built from the expert labeling approach in both datasets, providing strong evidence that the novel associations discovered by the ML-based method influence VCDR biology, and suggesting that the improved phenotyping accuracy (i.e., more accurate VCDR measurement) of the model translates into a more powerful GWAS.

The correlation between a polygenic risk score (PRS) for VCDR generated from the ML-based approach and the exhaustive expert labeling approach (Craig et al.). In these plots, higher values on the y-axis indicate a greater correlation and therefore greater prediction from only the genetic data. [* — p ≤ 0.05; *** — p ≤ 0.001]
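
For intuition, under the standard additive model a PRS is a weighted sum of an individual’s allele dosages, with weights taken from GWAS effect sizes. The sketch below uses made-up numbers and omits steps such as linkage-disequilibrium adjustment and p-value thresholding:

```python
import numpy as np

# Additive polygenic risk score: PRS = sum_i (beta_i * dosage_i).
# Effect sizes (betas) come from a GWAS; dosages are 0/1/2 alternate
# allele counts per variant. All numbers here are illustrative.
effect_sizes = np.array([0.12, -0.05, 0.30, 0.08])
dosages = np.array([[0, 1, 2, 1],                 # person A
                    [2, 0, 1, 0]], dtype=float)   # person B
prs = dosages @ effect_sizes
print(prs)  # one score per person; higher means more predicted risk
```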

As a second validation, because we know that VCDR is strongly correlated with glaucoma, we also investigated whether the ML-based PRS was associated with whether individuals had either self-reported glaucoma or medical procedure codes suggestive of glaucoma or glaucoma treatment. We found that the PRS for VCDR determined using our model predictions was also predictive of the probability that an individual had indications of glaucoma. Individuals with a PRS 2.5 or more standard deviations higher than the mean were more than 3 times as likely to have glaucoma in this cohort. We also observed that the VCDR PRS from ML-based phenotypes was more predictive of glaucoma than the VCDR PRS produced from the extensive manual phenotyping.

The odds ratio of glaucoma (self-report or ICD code) stratified by the PRS for VCDR determined using the ML-based phenotypes (in standard deviations from the mean). In this plot, the y-axis shows the probability that the individual has glaucoma relative to the baseline rate (represented by the dashed line). The x-axis shows standard deviations from the mean for the PRS. Data are visualized as a standard box plot, which illustrates values for the mean (the orange line), first and third quartiles, and minimum and maximum.

Conclusion
We have shown that ML models can be used to quickly phenotype large cohorts for GWAS, and that these models can increase statistical power in such studies. Although these examples were shown for eye traits predicted from retinal imaging, we look forward to exploring how this concept could generally apply to other diseases and data types.

Acknowledgments
We would like to especially thank co-author Dr. Anthony Khawaja of Moorfields Eye Hospital for contributing his extensive medical expertise. We also recognize the efforts of Professor Jamie Craig and colleagues for their exhaustive labeling of UK Biobank images, which allowed us to make comparisons with our method. Several authors of that work, as well as Professor Stuart MacGregor and collaborators in Australia and at Max Kelsen have independently replicated these findings, and we value these scientific contributions as well.

Categories
Misc

Run RAPIDS on Microsoft Windows 10 Using WSL 2—The Windows Subsystem for Linux

This post was originally published on the RAPIDS AI Blog.

A tutorial to run your favorite Linux software, including NVIDIA CUDA, on Windows

RAPIDS is now more accessible to Windows users! This post walks you through installing RAPIDS on Windows Subsystem for Linux (WSL). WSL is a Windows 10 feature that enables users to run native Linux command-line tools directly on Windows. Using this feature does not require a dual boot environment, taking away complexity and hopefully saving you time. You’ll need access to an NVIDIA GPU with NVIDIA Pascal architecture or newer. Let’s get started right away.

Getting Started

To install RAPIDS, you’ll need to do the following:

  1. Install the latest builds from the Microsoft Insider Program.
  2. Install the NVIDIA preview driver for WSL 2.
  3. Install WSL 2.
  4. Install RAPIDS.

Steps 1–3 can be completed by following the NVIDIA CUDA on WSL guide. However, there are some gotchas. This article will walk through each section and point out what to look out for. We recommend opening a tab for the guide alongside this post to make sure that you don’t miss anything. Before you start, be aware that all the steps in the guide must be carried out in order. It’s particularly important that you install a fresh version of WSL 2 only after installing the new build and driver. Also note that the CUDA Toolkit will be installed along with RAPIDS in step 4, so stop following the CUDA on WSL guide after you reach the Setting up CUDA Toolkit section.

Installing the latest builds from the Microsoft Insider program

For your program to run correctly, you need to be using Windows Build version 20145 or higher. When installing the builds, some things to note are:

  • Start off by navigating to your Windows menu. Select Settings > Update and Security > Windows Update. Make sure that you don’t have any pending Windows updates. If you do, click the update button to ensure you’re starting out without any.
  • Dev Channel (previously Fast ring): The guide refers to the channel you should download your build from as the Fast ring; this channel is now called the Dev Channel. Windows calls the process of updating and installing the latest builds ‘flighting.’ During this process, you must select the Dev Channel when choosing which updates to receive.
  • Downloading and updating requires a restart and can take up to 90 minutes. Feel free to grab a coffee while you wait ;).
  • After you’ve restarted your computer, check your build version by running winver via the Windows Run command. It can be a little tricky to identify the right number. Here’s what you should look for after a successful installation (BUILD 20145 or higher):
Figure 1: The build version is now OS Build 21296, which is sufficient to run WSL 2.

Once you’ve confirmed your build, move onto step 2.

Installing NVIDIA drivers

Next, you’ll need to install an NVIDIA Driver. Keep the following in mind:

  • Select the driver based on the type of NVIDIA GPU in your system. To verify your GPU type, look for the NVIDIA Control Panel in your Start menu; the name should appear there. See the CUDA on Windows Subsystem for Linux (WSL) public preview for more information.
  • Once the download is complete, install the driver using the executable. We strongly recommend choosing the default location for saving it.
  • To check that the driver was installed successfully, run the command nvidia-smi in PowerShell. It should output a table with information about your GPU and the driver. You’ll notice the driver version is the same as the one you downloaded.
Figure 2: The NVIDIA driver has been installed correctly (version 465.21).

(Your table might be much shorter and not show any GPU processes. As long as you can see a table and no visible errors, your install should have been successful!) If your driver is successfully installed, let’s jump to step 3. If nothing appears, check if you’ve missed any of the steps and if your build version is correct.

Installing WSL 2

Next, you’ll install WSL 2 with a Linux distribution of your choice using the docs here. Make sure that the distribution you choose is supported by RAPIDS. You can confirm this here. The rest of this post describes the installation of WSL 2 with Ubuntu 18.04. These steps should work similarly with other supported distributions.

There are two ways you can install your RAPIDS-supporting Linux distribution with WSL 2 on Windows 10. The instructions listed in the Windows guide can seem overwhelming, so we’ve distilled them down to the most important parts here:

Using the command line

  • Open your command line and make sure you’re running it as Administrator.
  • Find out which Linux distributions are available and support WSL by typing in the command wsl --list --online.
  • To install a distribution, use the command wsl --install -d <DistributionName>.
  • For Ubuntu 18.04, this command translates to wsl --install -d Ubuntu-18.04 (note the capital U). This should download and install your Linux distribution.
  • Your selected distribution should either immediately open or appear in your Windows Start menu.
  • If this is not true for you, double-check that your Linux distribution and WSL install was successful by running wsl.exe --list. If no distribution appears, navigate to “Programs” in your Control Panel. Confirm that the “Windows Hypervisor Platform” and “Windows Subsystem for Linux” boxes are checked, as shown in Figure 3. Once confirmed, reboot your computer and try running the install again (possibly twice). Ideally, the WSL terminal should pop up right after the installation.
Figure 3: If your WSL terminal doesn’t open right away, make sure these Windows features are enabled on your system as well.
  • When opening your WSL terminal for the first time, you will be prompted to set up a default (non-root) user. Ensure that you do not skip this step, as you will need this user’s sudo privileges to install packages later.
  • Once you’ve set the default user, proceed to reboot your machine. When you return, you’ll be all set for step 4.

Through the Microsoft Store

  • If you already know which distribution you would like to use, you can download and install it directly from the Microsoft Store on your machine.
  • You’ll need to set the default user and do a reboot in this case as well.

Once you’ve completed this step, you’re ready to install the CUDA Toolkit and almost done!

Install RAPIDS

  • If you don’t have it already, start by installing and activating Miniconda in your WSL terminal. We’ll be using the conda command to install the packages we need in this step.
  • You can install RAPIDS with a single conda command; the exact command depends on your RAPIDS release, CUDA version, and Python version, as shown after this list.
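
The RAPIDS release selector on the RAPIDS Getting Started page generates the correct command for your setup. As an illustration only (the release, Python, and CUDA versions below are assumptions, not a recommendation), the command takes a form like conda create -n rapids-21.06 -c rapidsai -c nvidia -c conda-forge rapids=21.06 python=3.7 cudatoolkit=11.2.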

To test your installation, start up the RAPIDS environment. You can do this by:

  • Typing conda info --envs, which will let you know the name of the installed RAPIDS environment, and activating that environment with conda activate <environment name>.
  • Note: cuDF is supported only on Linux and with Python versions 3.7 and later.
  • Finally, importing any RAPIDS library or starting a Jupyter notebook, as sketched after this list.
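
As a quick smoke test (a minimal sketch, assuming the RAPIDS environment is active), build a small cuDF DataFrame on the GPU:

```python
import cudf  # RAPIDS GPU DataFrame library

# If this runs without errors, RAPIDS can reach your GPU through WSL 2.
gdf = cudf.DataFrame({"a": [1, 2, 3], "b": [4.0, 5.0, 6.0]})
print(gdf.describe())
```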

Hopefully, your installation was successful. RAPIDS is open-source, so if you managed to get this far and would like to contribute, take another look at the contributing guide of any of our libraries or join the RAPIDS Slack channel to find out more.

Categories
Misc

Sherd Alert: GPU-Accelerated Deep Learning Sorts Pottery Fragments as Well as Expert Archeologists

A pair of researchers at Northern Arizona University used GPU-based deep-learning algorithms to categorize sherds — tiny fragments of ancient pottery — as well as, or better than, four expert archaeologists. The technique is outlined in a paper by Leszek Pawlowicz and Christian Downum, published in the June issue of The Journal of Archaeological Science.

Categories
Misc

NVIDIA Studio Goes 3D: Real-Time Ray Tracing and AI Accelerate Adobe’s New Substance 3D Collection of Design Applications

The NVIDIA Studio ecosystem continues to deliver time-saving features and visual improvements to top creative applications. Today, Adobe announced a significant update to their 3D lineup, with new and improved tools available in the Adobe Substance 3D Collection: new versions of Substance 3D Painter, Designer and Sampler, as well as the new application Substance 3D Stager.

Categories
Misc

As Fast as One Can Gogh: Turn Sketches Into Stunning Landscapes with NVIDIA Canvas

Turning doodles into stunning landscapes — there’s an app for that. The NVIDIA Canvas app, now available as a free beta, brings the real-time painting tool GauGAN to anyone with an NVIDIA RTX GPU. Developed by the NVIDIA Research team, GauGAN has wowed creative communities at trade shows around the world by using deep learning to turn simple sketches into realistic landscape images.

Categories
Misc

Intro to Deep Learning project in TensorFlow 2.x and Python – free course from Udemy

submitted by /u/Ordinary_Craft
Categories
Misc

Concating 3 multivariate sequences as an input to 1 model?

I’ve been trying to figure it out for about a week now but I keep getting ‘Data cardinality is ambiguous’. I’m creating a sequential model for each multivariate sequence, then concating the .output from each of those models as the input to a Keras model. I’m also feeding the inputs in as a list of each .input from each model.

Even when I make the last layer of each sequence’s model a dense layer with the same number of units, the cardinality error still complains about concating different sequence lengths.

Any ideas or working code appreciated
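
A minimal working sketch of the pattern described, using the Keras functional API (shapes, layer sizes, and data here are made up), is below. The usual cause of ‘Data cardinality is ambiguous’ is that the arrays in the input list have different first (sample) dimensions, so each sequence array must contain the same number of samples even if timesteps and features differ:

```python
import numpy as np
from tensorflow.keras import layers, Model

# Three multivariate sequences with different timesteps/features per branch,
# but the SAME number of samples -- mismatched sample counts are what
# trigger "Data cardinality is ambiguous".
n_samples = 64
shapes = [(10, 3), (20, 5), (15, 2)]  # (timesteps, features) per sequence

inputs, branches = [], []
for t, f in shapes:
    inp = layers.Input(shape=(t, f))
    branches.append(layers.LSTM(32)(inp))  # reduce each sequence to a vector
    inputs.append(inp)

merged = layers.concatenate(branches)     # concat the fixed-size vectors
output = layers.Dense(1)(merged)
model = Model(inputs=inputs, outputs=output)
model.compile(optimizer="adam", loss="mse")

X = [np.random.rand(n_samples, t, f) for t, f in shapes]
y = np.random.rand(n_samples, 1)
model.fit(X, y, epochs=1, batch_size=16)
```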

submitted by /u/Techguy13

Categories
Misc

Metropolis Spotlight: Nota Is Transforming Traffic Management Systems With AI

Nota, an NVIDIA Metropolis partner, is using AI to make roadways safer and more efficient with NVIDIA’s edge GPUs and deep learning SDKs.

Nota developed a real-time traffic control solution that uses image recognition technology to identify traffic volume and queues, analyze congestion, and optimize traffic signal controls at intersections. 

Using the DeepStream SDK’s off-the-shelf features, such as line crossing and setting a region of interest, Nota significantly improved how accurately it could examine traffic situations. Nota deployed the solution at a busy intersection in Pyeongtaek, South Korea to analyze traffic flow and control traffic lights in real time. Nota was able to improve traffic flow by 25% during regular hours, and by more than 300% during rush hour, saving the city traffic-congestion-related costs and reducing the time drivers spend stuck in traffic.
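
Conceptually, the line-crossing feature counts tracked objects whose centroids move from one side of a virtual line to the other between frames. The sketch below is an illustrative stand-in for that idea, not DeepStream’s actual API or configuration format:

```python
# Illustrative line-crossing count on tracked centroids (hypothetical
# data and names; DeepStream provides this natively via its analytics
# features rather than through code like this).

def side(p, a, b):
    """Sign of the cross product: which side of directed line a->b point p is on."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])

def count_crossings(tracks, line):
    """tracks: {object_id: [(x, y), ...]} centroid history; line: (a, b)."""
    a, b = line
    crossings = 0
    for path in tracks.values():
        for prev, cur in zip(path, path[1:]):
            if side(prev, a, b) * side(cur, a, b) < 0:  # sign flip = crossing
                crossings += 1
    return crossings

# Two vehicles crossing a virtual stop line at y = 100.
tracks = {1: [(50, 120), (52, 95)], 2: [(80, 130), (81, 110), (83, 90)]}
print(count_crossings(tracks, ((0, 100), (200, 100))))  # -> 2
```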

Read more in our solution showcase.

Categories
Misc

Metropolis Spotlight: INEX Is Revolutionizing Toll Road Systems with Real-time Video Processing

INEX Technologies, an NVIDIA Metropolis partner, designs, develops, and manufactures comprehensive hardware and software solutions for license plate recognition and vehicle identification.

The INEX RoadView solution provides automatic axle counting, vehicle classification, as well as lane zone tracking and triggering using LPR and RoadView cameras. RoadView video-based recognition eliminates the need for costly concrete cutting, in-ground loop maintenance, and axle-counting treadles.

NVIDIA GPUs are used to accelerate the real-time video analysis of the INEX ALPR system, which requires incredibly high accuracy along with high throughput and high frame rates. At the edge, INEX uses the NVIDIA Jetson Nano and Jetson Xavier NX platforms and the embedded software stack.

Under the hood  

The INEX video pipeline is based on the NVIDIA DeepStream SDK, which helps achieve super-optimized throughput and makes it simpler to integrate complex classification and detection algorithms. INEX further leverages some of the world’s most powerful AI productivity tools by integrating NVIDIA pre-trained models and the NVIDIA Transfer Learning Toolkit into their development workflow, reducing development time by a stunning 60%. And by going end-to-end with the full stack of NVIDIA hardware and software and deploying on the NVIDIA Jetson edge platform, they reduced hardware and setup costs by 60% and lowered operating and maintenance costs by 50%.

The implications and impact for INEX are significant. Leveraging the NVIDIA platform, they can roll out world-class solutions that perform challenging real-time vehicle detection and classification and read license plates from all 50 US states, and they have expanded to countries in Europe, the Far East, the Middle East, and Australia. Tolling authorities upgrading to the INEX vehicle classification and ALPR system can supercharge their toll systems quickly and easily, leveraging the latest AI technology.

Read more in our solution showcase.