
New on NGC: SDKs for Large Language Models, Digital Twins, Digital Biology, and More

NVIDIA announces new SDKs available in the NGC catalog, a hub of GPU-optimized deep learning, machine learning, and HPC applications. With highly performant software containers, pretrained models, industry-specific SDKs, and Jupyter notebooks available, AI developers and data scientists can simplify and reduce complexities in their end-to-end workflows.

This post provides an overview of new and updated services in the NGC catalog, along with the latest advanced SDKs to help you streamline workflows and build solutions faster.

Simplifying access to large language models

Recent advances in large language models (LLMs) have fueled state-of-the-art performance for NLP applications, such as virtual scribes in healthcare, interactive virtual assistants, and many more. 

NVIDIA NeMo Megatron

NVIDIA NeMo Megatron, an end-to-end framework for training and deploying LLMs with up to trillions of parameters, is now available in open beta from the NGC catalog. It consists of an end-to-end workflow for automated distributed data processing; training large-scale customized GPT-3, T5, and multilingual T5 (mT5) models; and deploying models for inference at scale.

NeMo Megatron can be deployed on several cloud platforms, including Microsoft Azure, Amazon Web Services, and Oracle Cloud Infrastructure. It can also be accessed through NVIDIA DGX SuperPODs and NVIDIA DGX Foundry.

Request NeMo Megatron in open beta.

NVIDIA NeMo LLM

The NVIDIA NeMo LLM service provides the fastest path to customize foundation LLMs and deploy them at scale, using the NVIDIA-managed cloud API or through private and public clouds.

NVIDIA and community-built foundation models can be customized using prompt learning capabilities: compute-efficient techniques that embed context in user queries to enable greater accuracy for specific use cases. These techniques require just a few hundred samples to achieve high accuracy, for applications ranging from text summarization and paraphrasing to story generation.
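NeMo LLM is in early access and its interface is not documented here, so the snippet below is only a hypothetical sketch of what calling a prompt-customized model through a managed cloud API could look like; the endpoint URL, payload fields, model identifier, and authentication scheme are all assumptions, not the actual service API.

```python
# Hypothetical sketch only: the endpoint, payload fields, and auth header are
# assumptions for illustration, not the published NeMo LLM service API.
import requests

API_URL = "https://api.llm.ngc.nvidia.com/v1/completions"  # assumed endpoint
HEADERS = {"Authorization": "Bearer <NGC_API_KEY>"}        # assumed auth scheme

payload = {
    "model": "gpt-20b",                      # assumed model identifier
    "customization_id": "summarization-v1",  # assumed: selects a prompt-learned task
    "prompt": "Summarize: NVIDIA announced new SDKs in the NGC catalog ...",
    "tokens_to_generate": 64,
}

response = requests.post(API_URL, headers=HEADERS, json=payload, timeout=30)
print(response.json())
```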

This service also provides access to the Megatron 530B model, one of the world’s largest LLMs with 530 billion parameters. Additional model checkpoints include 3B T5 and NVIDIA-trained 5B and 20B GPT-3.

Apply now for NeMo LLM early access.

NVIDIA BioNeMo

The NVIDIA BioNeMo service is a unified cloud environment for end-to-end, AI-based drug discovery workflows, without the need for IT infrastructure.

Today, the BioNeMo service includes two protein models, with models for DNA, RNA, generative chemistry, and other biology and chemistry models coming soon.

ESM-1 is a protein LLM trained on 52 million protein sequences. It can be used to help drug discovery researchers understand protein properties, such as cellular location or solubility, and secondary structures, such as alpha helices or beta sheets.
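BioNeMo is delivered as a managed cloud service, so the following is not the BioNeMo API. It is a minimal sketch of the same idea, extracting per-residue embeddings from a pretrained protein language model, using the open-source fair-esm package; the model choice and toy sequence are assumptions for illustration.

```python
# Sketch using the open-source fair-esm package (pip install fair-esm), not the
# BioNeMo service API, to illustrate protein-LLM embedding extraction.
import torch
import esm

# Load a pretrained ESM-1b model and its tokenizer/alphabet.
model, alphabet = esm.pretrained.esm1b_t33_650M_UR50S()
batch_converter = alphabet.get_batch_converter()
model.eval()

# A toy sequence; real workflows would batch many sequences.
data = [("example_protein", "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ")]
labels, strs, tokens = batch_converter(data)

with torch.no_grad():
    out = model(tokens, repr_layers=[33])

# Per-residue embeddings that a downstream head could map to properties
# such as solubility or secondary structure.
embeddings = out["representations"][33]
print(embeddings.shape)  # (batch, sequence_length, embedding_dim)
```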

The second protein model in the BioNeMo service is OpenFold, a PyTorch-based NVIDIA-optimized reproduction of AlphaFold2 that quickly predicts the 3D structure of a protein from its primary amino acid sequence.

With the BioNeMo service, chemists, biologists, and AI drug discovery researchers can generate novel therapeutics and understand the properties and function of proteins and DNA. Ultimately, they can combine many AI models in a connected, large-scale, in silico AI workflow that requires supercomputing scale over multiple GPUs.

BioNeMo will enable end-to-end modular drug discovery to accelerate research and better understand proteins, DNA, and chemicals.

Apply now for BioNeMo early access.

AI frameworks for 3D and digital twin workflows

A digital twin is a virtual representation of a real-world physical asset or system: a true-to-reality simulation of its physics and materials that is continuously updated. Digital twins aren't just for inanimate objects and people. They can replicate a process, such as a fulfillment center workflow, to test human-robot interactions before activating certain robot functions in live environments, and the applications are as wide as the imagination.

NVIDIA Omniverse Replicator

NVIDIA Omniverse Replicator is a highly extensible framework built on the NVIDIA Omniverse platform that enables physically accurate 3D synthetic data generation to accelerate the training and accuracy of perception networks.

Technical artists, software developers, and ML engineers can now easily build custom, physically accurate, synthetic data generation pipelines in the cloud or on-premises with the Omniverse Replicator container available from the NGC catalog.
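As a rough illustration of what such a pipeline looks like, the sketch below is modeled on the Replicator getting-started examples and is meant to run inside an Omniverse application (for example, the Replicator container's script environment); the placeholder assets, randomization ranges, and output directory are assumptions.

```python
# Rough sketch of a Replicator randomization graph, modeled on the getting-started
# examples; run it inside an Omniverse application (it is not a standalone script).
import omni.replicator.core as rep

with rep.new_layer():
    camera = rep.create.camera(position=(0, 0, 1000))
    render_product = rep.create.render_product(camera, (1024, 1024))

    # Placeholder semantically labeled assets standing in for real scene content.
    cube = rep.create.cube(semantics=[("class", "cube")], position=(0, -200, 100))
    sphere = rep.create.sphere(semantics=[("class", "sphere")], position=(100, 100, 100))

    # Randomize object poses on every frame to produce varied synthetic samples.
    with rep.trigger.on_frame(num_frames=20):
        with rep.create.group([cube, sphere]):
            rep.modify.pose(
                position=rep.distribution.uniform((-200, -200, 0), (200, 200, 200)),
                scale=rep.distribution.uniform(0.5, 2.0),
            )

    # Write RGB images and 2D bounding-box annotations to disk.
    writer = rep.WriterRegistry.get("BasicWriter")
    writer.initialize(output_dir="_replicator_out", rgb=True, bounding_box_2d_tight=True)
    writer.attach([render_product])

rep.orchestrator.run()
```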

Download the Omniverse Replicator container for self-service cloud deployment.

NVIDIA Modulus

NVIDIA Modulus is a neural network AI framework that enables you to create customizable training pipelines for digital twins, climate models, and physics-based modeling and simulation.

Modulus is integrated with NVIDIA Omniverse so that you can visualize the outputs of Modulus-trained models. This interface enables interactive exploration of design variables and parameters for inferring new system behavior and visualizing it in near real time.

The latest release (v22.09) includes key enhancements to increase composition flexibility for neural operator architectures, features to improve training convergence and performance, and, most importantly, significant improvements to the user experience and documentation.

Download the latest version of Modulus.

Deep learning software

The most popular deep learning frameworks for training and inference are updated monthly. Pull the latest versions (v22.09) from the NGC catalog.

New pretrained models

We are constantly adding state-of-the-art models for a variety of speech and vision tasks. The following pretrained models are new on NGC (a short loading sketch follows the list):

  • SLU Conformer-Transformer-Large SLURP: Performs joint intent classification and slot filling, directly from audio input.
  • Riva ASR Korean LM: An automatic speech recognition (ASR) engine that can optionally condition the transcript output on n-gram language models.
  • LangID Ambernet: Used for spoken language identification (LangID or LID) and serves as the first step for ASR.
  • STT En Squeezeformer CTC Small Librispeech: A model for English ASR that is trained with NeMo on the LibriSpeech dataset.
  • TTS De FastPitch HiFi-GAN: This collection contains two models: FastPitch, which was trained on over 23 hours of German speech from one speaker, and HiFi-GAN, which was trained on mel spectrograms produced by the FastPitch model.
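As a quick illustration of how an NGC pretrained checkpoint such as the Squeezeformer CTC model can be pulled into a workflow, here is a minimal NeMo sketch; the exact model identifier string is an assumption, so check the model card on NGC for the published name.

```python
# Sketch of loading an NGC pretrained NeMo ASR checkpoint; the model name string is
# an assumption for illustration -- check the NGC model card for the exact identifier.
import nemo.collections.asr as nemo_asr

asr_model = nemo_asr.models.ASRModel.from_pretrained(
    model_name="stt_en_squeezeformer_ctc_small_ls"  # assumed identifier
)

# Transcribe a local 16 kHz mono WAV file.
print(asr_model.transcribe(["sample_audio.wav"])[0])
```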

Explore more pretrained models for common AI tasks on the NGC Models page.


An AIoT Solution for Visual Blockage Detection at Culverts

One of the key contributors to flash flooding is the blockage of cross-drainage hydraulic structures, such as culverts, by unwanted, flood-borne debris.

The accumulation and interaction of debris with culverts often result in reduced hydraulic capacity, diversion of upstream flows, and structural failure. For example, the 2007 floods in Newcastle, Australia; the 1998 floods in Wollongong, Australia; and the 2021 floods in Pentre, United Kingdom are just a few instances where blockage was reported as a primary cause of cross-drainage hydraulic structure failure.

In this post, we describe our technique for building a diverse visual dataset for computer vision model training, including examples of synthetic images. We break down each component of our solution and provide insights on future research directions.

Problem

Non-linear debris accumulation, the unavailability of real-time data, and complex hydrodynamics make a conventional numerical modeling approach unsuitable for this problem. Instead, post-flood visual information has been used to develop blockage policies that rely on several assumptions, which many argue are not truly representative of blockage.

This suggests the need for better understanding and exploring the blockage issue from a technology perspective to aid flood management officials and policymakers.

StopBlock: A technology initiative to monitor the visual blockage of culverts

To help address the blockage problem, StopBlock was initiated as a part of SMART Stormwater Management. Overall, this project involved collaboration between city councils in the Illawarra (Wollongong, Shellharbour, and Kiama) and Shoalhaven regions, Lendlease, and the University of Wollongong’s SMART Infrastructure Facility.

StopBlock aims to assess and monitor the visual blockage at culverts in real time using the latest technologies:

  • Artificial intelligence
  • Computer vision
  • Edge computing
  • Internet of Things (IoT)
  • Intelligent video analytics

In addition, we built and deployed an artificial intelligence of things (AIoT) solution using NVIDIA edge computing, the latest computer vision detection and classification models, a CCTV camera, and a 4G module. The solution detected the visual blockage status (blocked, partially blocked, or clear) at three culvert sites within the Illawarra region.

Building visual datasets for computer vision model training

Training CNN-based computer vision models requires numerous images related to the intended task. The problem of culvert blockage detection had not been addressed from this perspective before, and no image database or dataset existed for the purpose.

We developed a new training database consisting of diverse image data related to culvert blockage. These images showed varying culvert types, debris types, camera angles, scaling, and lighting conditions.

Only limited data on real culvert blockages was available through city council records, so we adopted a combination of real, lab-simulated, and synthetic visual data.

Images of culvert openings and blockage

We collected real images of culverts (blocked and clear) from multiple sources:

  • City council historical records
  • Online repositories
  • Local culvert sites

The collected images represent great diversity in terms of culvert types, debris types, illumination conditions, camera viewpoints, scale, resolution, and even backgrounds. The Images of Culvert Openings and Blockages (ICOB) dataset consisted of 929 images in total.

Figure 1. Samples from the ICOB dataset with bounding-box annotations

Visual hydraulics-lab blockage dataset

Because not enough real images were available, we collected simulated images from scaled laboratory experiments to supplement the visual dataset.

A thorough hydraulics laboratory investigation was performed where a series of experiments used scaled physical models of culverts. Blockage scenarios used scaled debris (urban and vegetative) under various flooding conditions.

The images represented diversity in terms of culvert types (single circular, double circular, single box, or double box), blockage types (urban, vegetative, or mixed), simulated lighting conditions, camera viewpoints (two cameras), and flooding conditions (inlet discharge levels). However, the dataset was limited by water-surface reflections, unusually clear water, and an identical background and scale across images.

In total, we collected 1,630 images from these experiments to establish the visual hydraulics-lab dataset (VHD).

Figure 2. Samples from the VHD dataset with bounding-box annotations

Synthetic images of culverts

We generated synthetic images of culverts (SIC) using a three-dimensional application built on the Unity game engine, with the goal of enhancing the training datasets.

The application is specifically designed to simulate culvert blockage scenarios and can generate virtually countless instances of blocked culverts with any possible blockage situation that you can think of. You can also alter culvert types, water levels, debris types, camera viewpoints, time of the day, and scaling.

The app design enables you to select scene features from dropdown menus and drag debris objects from a library to place anywhere in the scene with any possible orientation. You can write code using parameters to recreate multiple scenarios and batch capture the images with corresponding labels, to aid the training process.

Some highlighted limitations included unrealistic effects and animations and a single natural background. Figure 3 shows samples from the SIC dataset.

Figure 3. Samples from the SIC dataset with bounding-box annotations

AIoT system development

We developed an AIoT solution using edge computing hardware, computer vision models, and sensors for the real-time visual blockage monitoring at culverts:

  • A CCTV camera to capture the culvert.
  • An NVIDIA Jetson TX2-powered edge computer to process images and infer blockage using the trained computer vision models.
  • 4G connectivity to transmit blockage-related data to a web-based dashboard.
  • Computer vision models to detect and classify the visual blockage at culverts.

More specifically, in terms of software, we adopted a two-stage detection-classification pipeline (Figure 4).

Detection stage

In the first stage, a computer vision object detection model (YOLOv4) is used to detect the culvert openings. The detected openings are cropped from the original image and are processed for the classification stage. If no culvert opening is detected, an alert is issued to suggest that the culvert might be submerged.

Classification stage

In the second stage, a CNN classification model (such as ResNet-50) is used to classify the cropped culvert openings into one of three blockage classes (blocked, partially blocked, or clear). The blockage-related information is then transmitted to a web dashboard for flood management officials to facilitate the decision-making process.

Figure 4. A two-stage detection-classification pipeline for visual blockage detection at culverts: visible openings are detected first and then classified as clear, partially blocked, or blocked
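The following is a minimal sketch of this two-stage flow, not the deployed DeepStream implementation described later; the exported model files, the detector's output format, and the class ordering are assumptions for illustration.

```python
# Minimal sketch of the two-stage detect-then-classify flow. Model file names,
# the detector's output format, and the class order are assumptions for illustration.
import numpy as np
import tensorflow as tf

BLOCKAGE_CLASSES = ["blocked", "partially_blocked", "clear"]  # assumed order

detector = tf.saved_model.load("culvert_opening_detector/")        # hypothetical export
classifier = tf.keras.models.load_model("blockage_classifier.h5")  # hypothetical export

def assess_blockage(frame: np.ndarray) -> list[str]:
    """Detect culvert openings in a frame and classify each one."""
    # Stage 1: detect culvert openings (assumed [N, 4] pixel-coordinate boxes + scores).
    detections = detector(tf.convert_to_tensor(frame[np.newaxis, ...], tf.float32))
    boxes = detections["boxes"].numpy()[0]
    scores = detections["scores"].numpy()[0]
    boxes = boxes[scores > 0.5]

    if len(boxes) == 0:
        # No visible opening detected: the culvert may be submerged.
        return ["submerged_alert"]

    # Stage 2: crop each detected opening and classify its blockage status.
    statuses = []
    for x1, y1, x2, y2 in boxes.astype(int):
        crop = frame[y1:y2, x1:x2]
        crop = tf.image.resize(crop, (224, 224))[tf.newaxis, ...] / 255.0
        probs = classifier.predict(crop, verbose=0)[0]
        statuses.append(BLOCKAGE_CLASSES[int(np.argmax(probs))])
    return statuses
```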

We trained the YOLOv4 and ResNet-50 models used for detection and classification, respectively, using the NVIDIA TAO platform powered by Python, TensorFlow, and Keras. We used a Linux machine equipped with the NVIDIA A100 GPU for training the models using images from the ICOB, VHD, and SIC datasets.
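The TAO training specification files are not reproduced here; as a rough stand-in, this Keras sketch shows the kind of three-class ResNet-50 fine-tuning the workflow performs, with the dataset directory layout, image size, and hyperparameters assumed for illustration.

```python
# Illustrative Keras fine-tuning of ResNet-50 for three blockage classes; directory
# layout, image size, and hyperparameters are assumptions, not the TAO spec used.
import tensorflow as tf

train_ds = tf.keras.utils.image_dataset_from_directory(
    "blockage_dataset/train",  # assumed folders: blocked/, partially_blocked/, clear/
    image_size=(224, 224),
    batch_size=32,
)

base = tf.keras.applications.ResNet50(include_top=False, weights="imagenet", pooling="avg")
base.trainable = False  # start by training only the new classification head

inputs = tf.keras.Input(shape=(224, 224, 3))
x = tf.keras.applications.resnet50.preprocess_input(inputs)
x = base(x, training=False)
outputs = tf.keras.layers.Dense(3, activation="softmax")(x)
model = tf.keras.Model(inputs, outputs)

model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
model.fit(train_ds, epochs=10)
```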

Here’s the four-stage approach adopted for development:

  • Stage I: We prepared a dataset from real and simulated images.
  • Stage II: We selected detection and classification models from the NVIDIA TAO model zoo and trained them using the TAO platform.
  • Stage III: We exported trained models to be deployed on the NVIDIA TX2 edge computer.
  • Stage IV: In the field, we deployed a complete hardware system and collected real data for fine-tuning the computer vision algorithms.

In terms of software performance, the culvert opening detection model achieved a validation mAP of 0.90, while the blockage classification model achieved a validation accuracy of 0.88.

We developed an end-to-end video analytics pipeline on the NVIDIA DeepStream 6 SDK, using the trained computer vision models to run inference on the NVIDIA Jetson TX2-powered edge computer. With these detection and classification models, the DeepStream pipeline achieved 24.8 FPS on the TX2 hardware.

We built the smart device for culvert blockage monitoring using a CCTV camera, an NVIDIA Jetson TX2 edge computer, and a 4G dongle (Figure 5). We optimized the hardware for power consumption and computational time to support real-time use. Powered by a solar panel, it consumes only 9.1 W of power on average. The AIoT solution is also configured to transmit blockage metadata to the web dashboard every hour.

To address privacy concerns, the solution avoids storing any images on board or in the cloud; it only processes the images and transmits the blockage metadata. Figure 5 shows the installation of the AIoT hardware at one of the remote sites to monitor culvert visual blockage.

Figure 5. NVIDIA Jetson TX2-based AIoT hardware setup (left) and pole-mounted field deployment (right) for real-time culvert visual blockage monitoring
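A minimal sketch of the hourly reporting loop described above is shown below; the dashboard endpoint, payload fields, and site identifier are assumptions, and only metadata (never imagery) is transmitted.

```python
# Hourly blockage-metadata upload sketch; the dashboard URL and payload fields
# are assumptions. Only metadata is transmitted -- frames stay on the device.
import time
from datetime import datetime, timezone

import requests

DASHBOARD_URL = "https://example-dashboard.local/api/blockage"  # hypothetical endpoint

def report_blockage(site_id: str, status: str) -> None:
    payload = {
        "site_id": site_id,
        "status": status,  # "blocked", "partially_blocked", or "clear"
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }
    requests.post(DASHBOARD_URL, json=payload, timeout=10)

while True:
    status = "clear"  # in the deployed system, this comes from the DeepStream pipeline
    report_blockage("illawarra_culvert_01", status)
    time.sleep(3600)  # transmit once per hour
```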

Future research directions

The potential of computer vision can be further explored to establish a better understanding of visual blockage by extracting blockage-related information:

  • Percentage visual blockage estimation
  • Flood-borne debris type recognition
  • Partially automated visual blockage classification

Percentage visual blockage estimation

In the context of flood management decision making, knowing the blockage status of a given culvert is not always enough to make a maintenance-related decision. Going one step further and estimating the percentage visual blockage at a given culvert assists flood management officials in prioritizing the culverts with high visual blockage.

One potential solution is a segmentation-classification pipeline that segments the visible openings from an image and classifies the segmented masks into one of four percentage visual blockage classes. Figure 6 shows the conceptual block diagram for percentage visual blockage estimation.

Figure 6. Conceptual diagram for percentage visual blockage estimation at culverts: visible culvert-opening masks are extracted with Mask R-CNN and classified into percentage blockage classes with a CNN classification model
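A rough sketch of that idea using torchvision's Mask R-CNN is shown below; the pretrained COCO weights are only a placeholder for a model fine-tuned on culvert-opening masks, and the percentage bins and classifier head are assumptions.

```python
# Conceptual sketch of the segmentation-then-classification idea; the fine-tuned
# weights, percentage bins, and classifier head are assumptions for illustration.
import torch
import torchvision
from torchvision.transforms.functional import to_tensor

PERCENT_BINS = ["0-25%", "25-50%", "50-75%", "75-100%"]  # assumed blockage bins

# Mask R-CNN would need fine-tuning on culvert-opening masks; COCO weights are a placeholder.
segmenter = torchvision.models.detection.maskrcnn_resnet50_fpn(weights="DEFAULT").eval()

# Hypothetical small CNN head mapping a masked opening crop to a percentage bin.
classifier = torchvision.models.resnet18(num_classes=len(PERCENT_BINS)).eval()

def estimate_blockage(image) -> list[str]:
    """image: HxWx3 uint8 array or PIL image of the culvert scene."""
    tensor = to_tensor(image)
    with torch.no_grad():
        masks = segmenter([tensor])[0]["masks"]  # [N, 1, H, W] soft masks
        bins = []
        for mask in masks:
            masked = tensor * (mask > 0.5)       # keep only the visible opening
            crop = torch.nn.functional.interpolate(masked[None], size=(224, 224))
            logits = classifier(crop)
            bins.append(PERCENT_BINS[int(logits.argmax())])
    return bins
```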

Flood-borne debris type recognition

The type of flood-borne debris interacting and accumulating at the culvert can result in distinct flooding impacts. Usually, vegetative debris is considered less concerning because of its porous nature in comparison to compact, urban debris.

Automatic detection of debris type is another crucial aspect to be explored.

Partially automated visual blockage classification

As a simple solution, a CNN classification model may be used to assist manual culvert inspections while keeping the flood management official in the loop. Given the complexity of the problem and our preliminary analysis, a CNN classification model alone cannot fully automate the process. However, a partially automated framework can be developed to facilitate it.

Figure 7 shows the concept of such a framework based on the classification probability of the trained model. If the classification probability for a given image is less than a given threshold, it can be flagged to flood management officials for cross-validation.

Figure 7. Partially automated visual blockage classification: images classified by the deep learning model with less than 80% confidence are referred to flood management experts for manual review
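A minimal sketch of this confidence-gating logic follows, with the 80% threshold taken from Figure 7 and the probability input standing in for the trained classifier's output.

```python
# Confidence-gated ("partially automated") classification sketch; the probability
# vector stands in for the trained CNN's softmax output.
import numpy as np

CLASSES = ["blocked", "partially_blocked", "clear"]
CONFIDENCE_THRESHOLD = 0.80  # below this, a flood management official reviews the image

def triage(probabilities: np.ndarray) -> dict:
    """Route a prediction either to automatic logging or to manual review."""
    top = int(np.argmax(probabilities))
    confidence = float(probabilities[top])
    if confidence >= CONFIDENCE_THRESHOLD:
        return {"status": CLASSES[top], "needs_manual_review": False}
    return {"status": CLASSES[top], "needs_manual_review": True, "confidence": confidence}

# Example: a low-confidence prediction gets flagged for cross-validation.
print(triage(np.array([0.45, 0.35, 0.20])))
```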

Summary

We presented an edge-computing solution for visual blockage detection at culverts to support timely maintenance and help avoid blockage-related flooding events.

We developed and deployed a detection-classification computer vision pipeline on NVIDIA edge-computing hardware to report the blockage status of a culvert as "clear," "blocked," or "partially blocked." To facilitate the training of computer vision models for this unique problem domain, we used simulated and synthetically generated images of culvert visual blockage.

There is tremendous scope for extending the solution to extract further and more detailed visual blockage information. Estimating percentage visual blockage, detecting flood-borne debris types, and developing a partially automated visual blockage classification framework are a few potential enhancements to the existing solution.


Upcoming Event: Level Up with NVIDIA: RTX in Unity

Learn how to leverage the latest NVIDIA RTX technology in Unity Engine and connect with experts during a live Q&A at this webinar on November 16.


Stormy Weather? Scientist Sharpens Forecasts With AI

Editor’s note: This is the first in a series of blogs on researchers advancing science in the expanding universe of high performance computing. A perpetual shower of random raindrops falls inside a three-foot metal ring Dale Durran erected outside his front door (shown above). It’s a symbol of his passion for finding order in the…
