In my model I have a keras.layers.Lambda layer as the output layer. I use the layer to do some post-processing and return either a 0 or a 1. The problem arises when I run the predict function, the lambda layer only returns a small portion of the last batch data.
Like imagine you have a lambda function that returns the same constant no matter the input given, then the layer won’t give predictions on all data that was passed to it.
def output_function(x): # Return 1 no matter the input return 1 model = keras.Sequential([ keras.layers.Dense(1, activation='sigmoid'), keras.layers.Lambda(output_function) ]) test_data = tf.random.uniform(shape=(79, 1)) model.predict(test_data, batch_size=32) # Returns an array of only 3 predictions
I suspect that it is because I don’t return a tensor of the shape (None, 1), but rather just (1). But I don’t know how to rewrite the output_function to return a tensor of shape (None, 1).
Researchers create a new AI algorithm that can analyze mammography scans, identify whether a lesion is malignant, and show how it reached its conclusion.
A recently developed AI platform is giving medical professionals screening for breast cancer a new, transparent tool for evaluating mammography scans. The research, creates an AI model that evaluates the scans and highlights parts of an image the algorithm finds relevant. The work could help medical professionals determine whether a patient needs an invasive—and often nerve-wracking—biopsy.
“If a computer is going to help make important medical decisions, physicians need to trust that the AI is basing its conclusions on something that makes sense,” Joseph Lo, professor of radiology at Duke and study coauthor said in a press release. “We need algorithms that not only work, but explain themselves and show examples of what they’re basing their conclusions on. That way, whether a physician agrees with the outcome or not, the AI is helping to make better decisions.”
One in every eight women in the US will develop invasive breast cancer during their lifetime. When detected early, a woman has a 93 percent or higher survival rate in the first 5 years.
Mammography, which uses low-energy X-rays to examine breast tissue for diagnosis and screening, is an effective tool for early detection, but requires a highly skilled radiologist to interpret the scans. However, false negatives and positives do occur, resulting in missed diagnosis and up to 40% of biopsied lesions being benign.
Using AI for medical imaging analysis has grown significantly in recent years and offers advantages in interpreting data. Implementing AI models also carries risks, especially when an algorithm fails.
“Our idea was to instead build a system to say that this specific part of a potential cancerous lesion looks a lot like this other one that I’ve seen before,” said study lead and Duke computer science Ph.D. candidate Alina Barnett. “Without these explicit details, medical practitioners will lose time and faith in the system if there’s no way to understand why it sometimes makes mistakes.”
Using 1,136 images from 484 patients within the Duke University Health System, researchers trained the algorithm to locate and evaluate potentially cancerous areas. This was accomplished by training the models to identify unhealthy tissue, or lesions, which often appear as bright or irregular shapes with fuzzy edges on a scan.
Radiologists then labeled these images, teaching the algorithm to focus on the fuzzy edges, also known as margins. Often associated with quick-growing cancerous breast tumor cells, margins are a strong indicator of cancerous lesions. With these carefully labeled images, the AI can compare cancerous and benign edges, and learn to distinguish between them.
The AI model uses the cuDNN-accelerated PyTorch deep learning framework and can be run on two NVIDIA P100 or V100 GPUs.
The researchers found the AI to be as effective as other machine learning-based mammography models, but it holds the advantage of having transparency in its decision-making. When the model is wrong, a radiologist can see how the mistake was made.
According to the study, the model could also be a useful tool when teaching medical students how to read mammogram scans and for resource-constrained areas of the world lacking cancer specialists.
The code from the study is available through GitHub.
Deep learning models require a lot of data to produce accurate predictions. Here’s how to solve the data processing problem for the medical domain with NVIDIA DALI.
Deep learning models require vast amounts of data to produce accurate predictions, and this need becomes more acute every day as models grow in size and complexity. Even large datasets, such as the well-known ImageNet with more than a million images, are not sufficient to achieve state-of-the-art results in modern computer vision tasks.
For this purpose, data augmentation techniques are required to artificially increase the size of a dataset by introducing random disturbances to the data, such as geometric deformations, color transforms, noise addition, and so on. These disturbances help produce models that are more robust in their predictions, avoid overfitting, and deliver better accuracy.
In medical imaging tasks, data augmentation is critical because datasets contain mere hundreds or thousands of samples at best. Models, on the other hand, tend to produce large activations that require a lot of GPU memory, especially when dealing with volumetric data such as CT and MRI scans. This typically results in training with small batch sizes on a small dataset. To avoid overfitting, more elaborate data preprocessing and augmentation techniques are required.
Preprocessing, however, often has a significant impact on the overall performance of the system. This is especially true in applications dealing with large inputs, such as volumetric images. These preprocessing tasks are typically run on the CPU due to simplicity, flexibility, and availability of libraries such as NumPy.
In some applications, such as segmentation or detection in medical images, the GPU utilization during training is usually suboptimal as data preprocessing is usually performed in the CPU. One of the solutions is to attempt to overlap data processing and training fully, but it is not always that simple.
Such a performance bottleneck leads to a chicken and egg problem. Researchers avoid introducing more advanced augmentations into their models due to performance reasons, and libraries don’t put the effort into optimizing preprocessing primitives due to low adoption.
GPU acceleration solution
You can improve the performance of applications with heavy data preprocessing pipelines significantly by offloading data preprocessing to the GPU. The GPU is typically underutilized in such scenarios but can be used to do the work that the CPU cannot complete in time. The result is better hardware utilization, and ultimately faster training.
This difference becomes more significant when you look at the NVIDIA submission for the MLPerf UNet3D benchmark. It used the same network architecture as in the BraTS21 winning solution but with a more complex data loading pipeline and larger input volumes (KITS19 dataset). The performance boost is an impressive 2x end-to-end training speedup when compared with the native pipeline (Figure 2).
This was made possible by NVIDIA Data Loading Library (DALI). DALI provides a set of GPU-accelerated building blocks, enabling you to build a complete data processing pipeline that includes data loading, decoding, and augmentation, and to integrate it with a deep learning framework of choice (Figure 3).
Volumetric image operations
Originally, DALI was developed as a solution for images classification and detection workflows. Later, it was extended to cover other data domains, such as audio, video, or volumetric images. For more information about volumetric data processing, see 3D Transforms or Numpy Reader.
DALI supports a wide range of image-processing operators. Some can also be applied to volumetric images. Here are some examples worth mentioning:
Random object bounding box
To showcase some of the mentioned operations, we use a sample from the BraTS19 dataset, consisting of MRI scans labeled for brain tumor segmentation. Figure 4 shows a two-dimensional slice extracted from a brain MRI scan volume, where the darker region represents a region labeled as an abnormality.
Resize upscales or downscales the image to a desired shape by interpolating the input pixels. The upscale or downscale is configurable for each dimension separately, including the selection of the interpolation method.
Warp affine operator
Warp affine applies a geometric transformation by mapping pixel coordinates from source to destination with a linear transformation.
Warp affine can be used to perform multiple transformations (rotation, flip, shear, scale) in one go.
Rotate allows you to rotate a volume around an arbitrary axis, provided as a vector, and an angle. It can also optionally extend the canvas so that the entire rotated image is contained in it. Figure 7 shows an example of a rotated volume.
Random object bounding box operator
Random object bounding box is an operator suited for detection and segmentation tasks. As mentioned earlier, medical datasets tend to be rather small, with target classes (such as abnormalities) occupying a comparatively small area. Furthermore, in many cases the input volume is much larger than the volume expected by the network. If you were to use random cropping windows for training, then the majority would not contain the target. This could cause the training convergence to slow down or bias the network towards false-negative results.
This operator selects pseudo-random crops that can be biased towards sampling a particular label. Connected component analysis is performed on the label map as a pre-step. Then, a connected blob is selected at random, with equal probability. By doing that, the operator avoids overrepresenting larger blobs.
You can also select to restrict the selection to the largest K blobs or specify a minimum blob size. When a particular blob is selected, a random cropping window is generated, within the range containing the given blob. Figure 8 shows this cropping window selection process.
The gain in learning speed can be significant. On the KITS19 dataset, nnU-Net achieves the same accuracy in 2134 in the test run epochs with the Random object bounding box operator as in 3,222 epochs with random crop.
Typically, the process of finding connected components is slow, but the number of samples in the data set can be small. The operator can be configured to cache the connected component information, so that it’s only calculated during the first epoch of the training.
I have trained a model using tf1.15, converted it to tflite and compiled it for the edgetpu, which is all working. However, my confidence values for all bounding boxes is 50%. It seems to recognise what is what somewhat well, such as putting many boxes over and around the object it’s trying to detect, but they’re all 50% confidence so it is difficult to get a clean output. I believe the issue was not caused by any conversions, but during training, as tensorboard also shows this.
Browse through MORF Gallery — virtually or at an in-person exhibition — and you’ll find robots that paint, digital dreamscape experiences, and fine art brought to life by visual effects. The gallery showcases cutting-edge, one-of-a-kind artwork from award-winning artists who fuse their creative skills with AI, machine learning, robotics and neuroscience. Scott Birnbaum, CEO and Read article >
With a new year underway, NVIDIA is helping enterprises worldwide add modern workloads to their mainstream servers using the latest release of the NVIDIA AI Enterprise software suite. NVIDIA AI Enterprise 1.1 is now generally available. Optimized, certified and supported by NVIDIA, the latest version of the software suite brings new updates including production support Read article >
GANcraft is a hybrid neural rendering pipeline to represent large and complex scenes using Minecraft.
Scientists at NVIDIA and Cornell University introduced a hybrid unsupervised neural rendering pipeline to represent large and complex scenes efficiently in voxel worlds. Essentially, a 3D artist only needs to build the bare minimum, and the algorithm will do the rest to build a photorealistic world. The researchers applied this hybrid neural rendering pipeline to Minecraft block worlds to generate a far more realistic version of the Minecraft scenery.
Previous works from NVIDIA and the broader research community (pix2pix, pix2pixHD, MUNIT, SPADE) have tackled the problem of image-to-image translation (im2im)—translating an image from one domain to another. At first glance, these methods might seem to offer a simple solution to the task of transforming one world to another—translating one image at a time. However, im2im methods do not preserve viewpoint consistency, as they have no knowledge of the 3D geometry, and each 2D frame is generated independently. As can be seen in the images that follow, the results from these methods produce jitter and abrupt color and texture changes.
MUNIT SPADE wc-vid2vid NSVF-W GANcraft
Enter GANcraft, a new method that directly operates on the 3D input world.
“As the ground truth photorealistic renderings for a user-created block world simply doesn’t exist, we have to train models with indirect supervision,” the researchers explained in the study.
The method works by randomly sampling camera views in the input block world and then imagining what a photorealistic version of that view would look like. This is done with the help of SPADE, prior work from NVIDIA on image-to-image translation, and was the key component in the popular GauGAN demo. GANcraft overcomes the view inconsistency of these generated “pseudo-groundtruths” through the use of a style-conditioning network that can disambiguate the world structure from the rendering style. This enables GANcraft to generate output videos that are view consistent, as well as with different styles as shown in this image!
While the results of the research are demonstrated in Minecraft, the method works with other 3D block worlds such as voxels. The potential to shorten the amount of time and expertise needed to build high-definition worlds increases the value of this research. It could help game developers, CGI artists, and the animation industry cut down on the time it takes to build these large and impressive worlds.
If you would like a further breakdown of the potential of this technology, Károly Zsolnai-Fehér highlights the research in his YouTube series: Two Minute Papers:
GANcraft was implemented in the Imaginaire library. This library is optimized for the training of generative models and generative adversarial networks, with support for multi-GPU, multi-node, and automatic mixed-precision training. Implementations of over 10 different research works produced by NVIDIA, as well as pretrained models have been released. This library will continue to be updated with newer works over time.
The GPU-accelerated Clara Parabricks v3.7 release brings support for gene panels, RNA-Seq, short tandem repeats, and updates to GATK 4.2 and DeepVariant 1.1.
The newest release of GPU-powered NVIDIA Clara Parabricks v3.7 includes updates to germline variant callers, enhanced support for RNA-Seq pipelines, and optimized workflows for gene panels. With now over 50 tools, Clara Parabricks powers accurate and accelerated genomic analysis for gene panels, exomes, and genomes for clinical and research workflows.
To date, Clara Parabricks has demonstrated 60x accelerations for state-of-the-art bioinformatics tools for whole genome workflows (end-to-end analysis in 22 minutes) and exome workflows (end-to-end analysis in 4 minutes), compared to CPU-based environments. Large-scale sequencing projects and other whole genome studies are able to analyze over 60 genomes/day on a single DGX server while both reducing the associated costs and generating more useful insights than ever before.
Accelerate and simplify gene panel workflows with the support of Unique Molecular Identifiers (UMIs) also known as molecular barcodes.
RNA-Seq support for transcriptome workflows with second gene fusion caller Arriba and RNA-Seq quantification tool Kallisto.
Short tandem repeat (STR) detection with ExpansionHunter.
Integration of the latest versions of germline callers DeepVariant v1.1 and GATK v4.2 with HaplotypeCaller.
A 10x accelerated BAM2FASTQ tool for converting archived data stored as either BAM or CRAM files back to FASTQ. Datasets can be updated by aligning to new and improved references.
Clara Parabricks 3.7 accelerates and simplifies gene panel analysis
While whole genome sequencing (WGS) is growing due to large-scale population initiatives, gene panels still dominate clinical genomic analysis. With time being one of the most important factors in clinical care, accelerating and simplifying gene-panel workflows is incredibly important for clinical sequencing centers. By further reducing the analysis bottleneck associated with gene panels, these sequencing centers can return results to clinicians faster, improving the quality of life for their patients.
Cancer samples used for gene panels are commonly derived from either a solid tumor or consist of cell-free DNA from the blood (liquid biopsies) of a patient. Compared to the discovery work in WGS, gene panels are narrowly focused on identifying genetic variants in known genes that either cause disease or can be targeted with specific therapies.
Gene panels for inherited diseases are often sequenced to 100x coverage while gene panels in cancer sequencing are sequenced to a much higher depth, up to several 1,000x for liquid biopsy samples. The higher coverage is required to detect lower frequency somatic mutations associated with cancer.
To improve the limit of detection for these gene panels, molecular barcodes or UMIs are used, as they significantly reduce the background noise. This limit of detection is pushed for liquid biopsies, and can include tens of thousands coverage in combination with UMIs to identify those needle-in-the-haystack somatic mutations circulating in the bloodstream. High-depth gene-panel sequencing can reintroduce a computational bottleneck in the required processing of many more sequencing reads.
With Clara Parabricks, UMI gene panels can now be processed 10x faster than traditional workflows, generating results in less than an hour.
The analysis workflow is also simplified. From raw FASTQ to consensus FASTQ, a single command line runs multiple inputs compared to the traditional Fulcrum Genomics (Fgbio) equivalent as seen in the example below.
RNA-Seq support with Arriba and Kallisto
Just as gene panels are important for sequencing cancer analysis, so too are RNA-Seq workflows for transcriptome analysis. In addition to STAR-fusion, Clara Parabricks v3.7 now includes Arriba, a fusion detection algorithm based on the STARRNA-Seq aligner. Gene fusions, in which two distinct genes join due to a large chromosomal alteration, are associated with many different types of cancer from leukemia to solid tumors.
Arriba can also detect viral integration sites, internal tandem duplications, whole exon duplications, circular RNAs, enhancer hijacking events involving immunoglobulin/T-cell receptor loci, and breakpoints in introns or intergenic regions.
Clara Parabricks v3.7 also incorporates Kallisto, a fast RNA-Seq quantification tool based on pseudo-alignment that identifies transcript abundances (aka gene expression levels based on sequencing read counts) from either bulk or single-cell RNA-Seq datasets. Alignment of RNA-Seq data is the first step of the RNA-Seq analysis workflow. With tools for transcript quantification, read alignment, and fusion calling, Clara Parabricks 3.7 now provides a full suite of tools to support multiple RNA-Seq workflows.
Short tandem repeat detection with ExpansionHunter
To support genotyping of short tandem repeats (STRs) from short-read sequencing data, ExpansionHunter support has been added to Parabricks v3.7. STRs, also referred to as microsatellites, are ubiquitous in the human genome. These regions of noncoding DNA have accordion-like stretches of DNA containing core repeat units between two and seven nucleotides in length, repeated in tandem, up to several dozen times.
STRs are extremely useful in applications such as the construction of genetic maps, gene location, genetic linkage analysis, identification of individuals, paternity testing, population genetics, and disease diagnosis.
There are a number of regions in the human genome consisting of such repeats, which can expand in their number of repetitions, causing disease. Fragile X Syndrome, ALS, and Huntington’s Disease are well-known examples of repeat-associated diseases.
ExpansionHunter aims to estimate sizes of repeats by performing a targeted search through a BAM/CRAM file for sequencing reads that span, flank, and are fully contained in each STR.
The addition of somatic callers and support of archived data
The previous releases of Clara Parabricks in 2021 brought a host of new tools, most importantly the addition of five somatics callers—MuSE, LoFreq, Strelka2, Mutect2, and SomaticSniper—for comprehensive cancer genomic analysis.
In addition, tools were added to take advantage of archived data in scenarios when original FASTQ files were deleted to save storage space. BAM2FASTQ is an accelerated version of GATK Sam2fastq, which converts an existing BAM or CRAM file to a FASTQ file. This allows users to realign sequencing reads to a new reference genome which will enable more variants to be called, along with providing researchers capabilities to normalize all their data to the same reference. Using the existing fq2bam tool, Clara Parabricks can realign reads from one reference to another in under 2 hours for a 30X whole genome.