So for example, let’s just say there is some unknown function of x, y=f(x). Now I can train the model with x and y and it would predict y when given x. But i don’t want exactly y, instead i want will y be greater than x or not? So the output should be 1 if it greater 0 if equal or lesser. How can I do this? So my data has 3 parameters: x, y, e(1 or 0). I need to train my model by giving out x and y and results as e. But then when I want predict i have to give x and y. Right? I want to give only x and get output 0 or 1. I could just train it with x as input and y as results and predict y from x and check them by myself but that is useless since I don’t want to know y it is just going to be waste of computational power.
Greetings, I am running a text classification task that tries to classify a text as belonging to one (and only one) of 25 classes.
I’ve used two accuracy metrics: tf.keras.metrics.Accuracy(), which was set as the default on the code I’m reusing, and tf.keras.metrics.CategoricalAccuracy(), as it seemed more appropriate.
CategoricalAccuracy is reporting a fairly good result of around 0.90, but the other Accuracy is reporting only 0.17. What exactly are the differences between these two, and am I doing something wrong?
Robotaxis are on their way to delivering safer transportation, driving across various landscapes and through starry nights. This week, Silicon Valley-based self-driving startup Pony.ai announced its next-generation autonomous computing platform, built on NVIDIA DRIVE Orin for high-performance and scalable compute. The centralized system will serve as the brain for a robotaxi fleet of Toyota Sienna Read article >
Hello, Operator. This GFN Thursday brings the launch of Tom Clancy’s Rainbow Six Extraction to GeForce NOW. Plus, four new games are joining the GeForce NOW library to let you start your weekend off right. Your New Mission, Should You Choose to Accept It Grab your gadgets and get ready to game. Tom Clancy’s Rainbow Read article >
In my model I have a keras.layers.Lambda layer as the output layer. I use the layer to do some post-processing and return either a 0 or a 1. The problem arises when I run the predict function, the lambda layer only returns a small portion of the last batch data.
Like imagine you have a lambda function that returns the same constant no matter the input given, then the layer won’t give predictions on all data that was passed to it.
For example:
def output_function(x): # Return 1 no matter the input return 1 model = keras.Sequential([ keras.layers.Dense(1, activation='sigmoid'), keras.layers.Lambda(output_function) ]) test_data = tf.random.uniform(shape=(79, 1)) model.predict(test_data, batch_size=32) # Returns an array of only 3 predictions
I suspect that it is because I don’t return a tensor of the shape (None, 1), but rather just (1). But I don’t know how to rewrite the output_function to return a tensor of shape (None, 1).
Researchers create a new AI algorithm that can analyze mammography scans, identify whether a lesion is malignant, and show how it reached its conclusion.
A recently developed AI platform is giving medical professionals screening for breast cancer a new, transparent tool for evaluating mammography scans. The research, creates an AI model that evaluates the scans and highlights parts of an image the algorithm finds relevant. The work could help medical professionals determine whether a patient needs an invasive—and often nerve-wracking—biopsy.
“If a computer is going to help make important medical decisions, physicians need to trust that the AI is basing its conclusions on something that makes sense,” Joseph Lo, professor of radiology at Duke and study coauthor said in a press release. “We need algorithms that not only work, but explain themselves and show examples of what they’re basing their conclusions on. That way, whether a physician agrees with the outcome or not, the AI is helping to make better decisions.”
One in every eight women in the US will develop invasive breast cancer during their lifetime. When detected early, a woman has a 93 percent or higher survival rate in the first 5 years.
Mammography, which uses low-energy X-rays to examine breast tissue for diagnosis and screening, is an effective tool for early detection, but requires a highly skilled radiologist to interpret the scans. However, false negatives and positives do occur, resulting in missed diagnosis and up to 40% of biopsied lesions being benign.
Using AI for medical imaging analysis has grown significantly in recent years and offers advantages in interpreting data. Implementing AI models also carries risks, especially when an algorithm fails.
“Our idea was to instead build a system to say that this specific part of a potential cancerous lesion looks a lot like this other one that I’ve seen before,” said study lead and Duke computer science Ph.D. candidate Alina Barnett. “Without these explicit details, medical practitioners will lose time and faith in the system if there’s no way to understand why it sometimes makes mistakes.”
Using 1,136 images from 484 patients within the Duke University Health System, researchers trained the algorithm to locate and evaluate potentially cancerous areas. This was accomplished by training the models to identify unhealthy tissue, or lesions, which often appear as bright or irregular shapes with fuzzy edges on a scan.
Radiologists then labeled these images, teaching the algorithm to focus on the fuzzy edges, also known as margins. Often associated with quick-growing cancerous breast tumor cells, margins are a strong indicator of cancerous lesions. With these carefully labeled images, the AI can compare cancerous and benign edges, and learn to distinguish between them.
The AI model uses the cuDNN-accelerated PyTorch deep learning framework and can be run on two NVIDIA P100 or V100 GPUs.
The researchers found the AI to be as effective as other machine learning-based mammography models, but it holds the advantage of having transparency in its decision-making. When the model is wrong, a radiologist can see how the mistake was made.
According to the study, the model could also be a useful tool when teaching medical students how to read mammogram scans and for resource-constrained areas of the world lacking cancer specialists.
The code from the study is available through GitHub.
Deep learning models require a lot of data to produce accurate predictions. Here’s how to solve the data processing problem for the medical domain with NVIDIA DALI.
Deep learning models require vast amounts of data to produce accurate predictions, and this need becomes more acute every day as models grow in size and complexity. Even large datasets, such as the well-known ImageNet with more than a million images, are not sufficient to achieve state-of-the-art results in modern computer vision tasks.
For this purpose, data augmentation techniques are required to artificially increase the size of a dataset by introducing random disturbances to the data, such as geometric deformations, color transforms, noise addition, and so on. These disturbances help produce models that are more robust in their predictions, avoid overfitting, and deliver better accuracy.
In medical imaging tasks, data augmentation is critical because datasets contain mere hundreds or thousands of samples at best. Models, on the other hand, tend to produce large activations that require a lot of GPU memory, especially when dealing with volumetric data such as CT and MRI scans. This typically results in training with small batch sizes on a small dataset. To avoid overfitting, more elaborate data preprocessing and augmentation techniques are required.
Preprocessing, however, often has a significant impact on the overall performance of the system. This is especially true in applications dealing with large inputs, such as volumetric images. These preprocessing tasks are typically run on the CPU due to simplicity, flexibility, and availability of libraries such as NumPy.
In some applications, such as segmentation or detection in medical images, the GPU utilization during training is usually suboptimal as data preprocessing is usually performed in the CPU. One of the solutions is to attempt to overlap data processing and training fully, but it is not always that simple.
Such a performance bottleneck leads to a chicken and egg problem. Researchers avoid introducing more advanced augmentations into their models due to performance reasons, and libraries don’t put the effort into optimizing preprocessing primitives due to low adoption.
GPU acceleration solution
You can improve the performance of applications with heavy data preprocessing pipelines significantly by offloading data preprocessing to the GPU. The GPU is typically underutilized in such scenarios but can be used to do the work that the CPU cannot complete in time. The result is better hardware utilization, and ultimately faster training.
This difference becomes more significant when you look at the NVIDIA submission for the MLPerf UNet3D benchmark. It used the same network architecture as in the BraTS21 winning solution but with a more complex data loading pipeline and larger input volumes (KITS19 dataset). The performance boost is an impressive 2x end-to-end training speedup when compared with the native pipeline (Figure 2).
This was made possible by NVIDIA Data Loading Library (DALI). DALI provides a set of GPU-accelerated building blocks, enabling you to build a complete data processing pipeline that includes data loading, decoding, and augmentation, and to integrate it with a deep learning framework of choice (Figure 3).
Volumetric image operations
Originally, DALI was developed as a solution for images classification and detection workflows. Later, it was extended to cover other data domains, such as audio, video, or volumetric images. For more information about volumetric data processing, see 3D Transforms or Numpy Reader.
DALI supports a wide range of image-processing operators. Some can also be applied to volumetric images. Here are some examples worth mentioning:
Resize
Warp affine
Rotate
Random object bounding box
To showcase some of the mentioned operations, we use a sample from the BraTS19 dataset, consisting of MRI scans labeled for brain tumor segmentation. Figure 4 shows a two-dimensional slice extracted from a brain MRI scan volume, where the darker region represents a region labeled as an abnormality.
Resize operator
Resize upscales or downscales the image to a desired shape by interpolating the input pixels. The upscale or downscale is configurable for each dimension separately, including the selection of the interpolation method.
Warp affine operator
Warp affine applies a geometric transformation by mapping pixel coordinates from source to destination with a linear transformation.
Warp affine can be used to perform multiple transformations (rotation, flip, shear, scale) in one go.
Rotate operator
Rotate allows you to rotate a volume around an arbitrary axis, provided as a vector, and an angle. It can also optionally extend the canvas so that the entire rotated image is contained in it. Figure 7 shows an example of a rotated volume.
Random object bounding box operator
Random object bounding box is an operator suited for detection and segmentation tasks. As mentioned earlier, medical datasets tend to be rather small, with target classes (such as abnormalities) occupying a comparatively small area. Furthermore, in many cases the input volume is much larger than the volume expected by the network. If you were to use random cropping windows for training, then the majority would not contain the target. This could cause the training convergence to slow down or bias the network towards false-negative results.
This operator selects pseudo-random crops that can be biased towards sampling a particular label. Connected component analysis is performed on the label map as a pre-step. Then, a connected blob is selected at random, with equal probability. By doing that, the operator avoids overrepresenting larger blobs.
You can also select to restrict the selection to the largest K blobs or specify a minimum blob size. When a particular blob is selected, a random cropping window is generated, within the range containing the given blob. Figure 8 shows this cropping window selection process.
The gain in learning speed can be significant. On the KITS19 dataset, nnU-Net achieves the same accuracy in 2134 in the test run epochs with the Random object bounding box operator as in 3,222 epochs with random crop.
Typically, the process of finding connected components is slow, but the number of samples in the data set can be small. The operator can be configured to cache the connected component information, so that it’s only calculated during the first epoch of the training.
Accelerate on your own
You can download the latest version of the prebuilt and tested DALI pip packages. The NGC containers for TensorFlow, PyTorch, and MXNet have DALI integrated. You can review the many examples and read the latest release notes for a detailed list of new features and enhancements.
See how DALI can help you accelerate data preprocessing for your deep learning applications. The best place to access is the NVIDIA DALI Documentation, including numerous examples and tutorials. You can also watch our GTC 2021 talk about DALI. DALI is an open-source project, and our code is available on the /NVIDIA/DALI GitHub repo. We welcome your feedback and contributions.
I have trained a model using tf1.15, converted it to tflite and compiled it for the edgetpu, which is all working. However, my confidence values for all bounding boxes is 50%. It seems to recognise what is what somewhat well, such as putting many boxes over and around the object it’s trying to detect, but they’re all 50% confidence so it is difficult to get a clean output. I believe the issue was not caused by any conversions, but during training, as tensorboard also shows this.
Browse through MORF Gallery — virtually or at an in-person exhibition — and you’ll find robots that paint, digital dreamscape experiences, and fine art brought to life by visual effects. The gallery showcases cutting-edge, one-of-a-kind artwork from award-winning artists who fuse their creative skills with AI, machine learning, robotics and neuroscience. Scott Birnbaum, CEO and Read article >