Edit: Sorry, should have read the rules first. Mods, if you take this down because it's not TensorFlow-specific, I understand.
I'm just starting to play with neural networks, object detection, and tracking, and I'm wondering what people use the confidence score of a detection for. Are there any common uses beyond simple confidence thresholding (i.e. output a detection if conf > 0.5, otherwise don't)? Papers that use the confidence value in interesting ways are welcome!
For my own project, I was wondering how I might use the confidence score in the context of object tracking. For fun, and because it's a super common application, I've been playing around with a traffic sign detector and deploying it in a simulation. In the simulation, I get consistent and accurate predictions for real signs, and then frequent but short-lived (i.e. 1-3 frame lifetime) false positives. I was thinking I could do some sort of tracking that uses the confidence values over a series of predictions to compute some kind of detection probability. For example, if I look at a series of 30 frames and in 20 of them I have a 0.3-confidence detection, where the bounding boxes all belong to the same tracked object, then I'd argue there is more evidence that an object is there than if I look at a series of 30 frames and have only 2 detections belonging to a single object, even at a higher confidence, e.g. conf = 0.6. How can I leverage the confidence scores to create a more robust detection and tracking pipeline? Or am I already way off base? (I've been trying to come up with a formula for how to do it, but probability and stochastics were never my strong suit, and I know the formulas I've been writing down implicitly assume independence, which I'm not sure holds here.)
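For what it's worth, here's the kind of thing I've been sketching (it assumes the per-frame confidences are independent, which I'm not sure is true here):

```python
import numpy as np

def track_evidence(confidences):
    """Combine per-frame detection confidences for one tracked object.

    Treats each confidence as an independent probability that the object is
    really there, and returns the probability that at least one detection is
    a true positive: 1 - prod(1 - c_i).
    """
    confidences = np.asarray(confidences, dtype=float)
    return 1.0 - np.prod(1.0 - confidences)

# 20 detections at 0.3 confidence vs. 2 detections at 0.6 confidence
print(track_evidence([0.3] * 20))  # ~0.9992
print(track_evidence([0.6] * 2))   # 0.84
```

Under that (questionable) independence assumption, the long-lived low-confidence track really does come out ahead of the short-lived high-confidence one, which matches my intuition above.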
Anyway, how do you use the confidence values in your own projects?
I am trying to convert a pretrained model (EfficientNet) that I have trained on some custom images and new labels. But when using tf2onnx to convert it to ONNX format, it asks for a checkpoint .meta file, which I can't find anywhere. I only see a .index and a .data file from the model after training it.
Many deep learning models created using TensorFlow require high processing capabilities to perform inference. Fortunately, there is a lite version of TensorFlow called TensorFlow Lite (TFLite for short) that allows these models to run on devices with limited capabilities, typically performing inference in less than a second.
This tutorial will go through how to prepare a Raspberry Pi (RPi) to run a TFLite model for classifying images. After that, the TFLite version of the MobileNet model will be downloaded and used for making predictions on-device.
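As a preview of where the tutorial ends up, the core of the on-device inference step looks roughly like this (a minimal sketch assuming tflite_runtime is installed and a quantized MobileNet file named mobilenet_v1_1.0_224_quant.tflite; the exact file used later may differ):

```python
import numpy as np
from PIL import Image
from tflite_runtime.interpreter import Interpreter

# Load the TFLite model and allocate its input/output tensors.
interpreter = Interpreter(model_path="mobilenet_v1_1.0_224_quant.tflite")
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# Resize the image to the size the model expects (224x224 for MobileNet).
height, width = input_details[0]["shape"][1:3]
image = Image.open("test.jpg").convert("RGB").resize((width, height))
input_data = np.expand_dims(np.array(image, dtype=np.uint8), axis=0)

# Run inference and read back the class scores.
interpreter.set_tensor(input_details[0]["index"], input_data)
interpreter.invoke()
scores = interpreter.get_tensor(output_details[0]["index"])[0]
print("Predicted class index:", int(np.argmax(scores)))
```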
Audiences are making a round trip to the moon with a science documentary that showcases China's recent lunar explorations. Fly to the Moon, a series produced by the China Media Group (CMG) entirely in NVIDIA Omniverse, details the history of China's space missions and shares some of the best highlights of the Chang'e-4 lunar lander.
Researchers at The Ohio State University are aiming to take autonomous driving to the limit. Autonomous vehicles require extensive development and testing for safe widespread deployment. A team at The Ohio State Center for Automotive Research (CAR) is building a Mobility Cyber Range (MCR), a dedicated platform for cybersecurity testing, in a self-driving …
As explained in the Batch Normalization paper, training neural networks becomes much easier if their inputs are Gaussian. This is clear. And if your model inputs are not Gaussian, RAPIDS will transform them to Gaussian in the blink of an eye.
Input normalization is critical for training neural nets. The idea of Gauss rank transformation was first introduced by Michael Jahrer in his winning solution of Porto Seguro’s Safe Driver Prediction challenge. He trained denoising auto-encoders and experimented with several input normalization methods. In the end, he drew this conclusion:
The best thing I found during the past and works straight out of the box is GaussRank. This works usually much better than standard mean/std scaler or min/max (normalization).
How it works
There are three steps to transform a vector of continuous values with an arbitrary distribution into a Gaussian distribution based on ranks, as shown in Figure 1: compute the rank of each value, rescale the ranks to the open interval (-1, 1), and apply the inverse error function (erfinv).
The CuPy implementation is straightforward and remarkably resembles NumPy operations. In fact, it is as simple as changing the imported functions to move the whole process from CPU to GPU without any other code changes.
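A minimal sketch of the forward transform is shown below (a simplified version of the recipe above, not the exact code from the accompanying notebook); swapping cupy for numpy and cupyx.scipy.special for scipy.special moves it back to the CPU:

```python
import cupy as cp
from cupyx.scipy.special import erfinv

def gauss_rank_transform(x, epsilon=1e-6):
    # 1. Rank every value (argsort of argsort yields ranks 0..n-1).
    ranks = cp.argsort(cp.argsort(x))
    # 2. Rescale the ranks to the open interval (-1, 1),
    #    clipping away the endpoints to avoid infinities in erfinv.
    scaled = 2.0 * ranks / (len(x) - 1) - 1.0
    scaled = cp.clip(scaled, -1 + epsilon, 1 - epsilon)
    # 3. Apply the inverse error function to get a Gaussian shape.
    return erfinv(scaled)
```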
The inverse transformation is used to restore the original values from the Gaussian-transformed ones. This is another great example of the interoperability of cuDF and CuPy. Just as you can with NumPy and Pandas, you can weave cuDF and CuPy together in the same workflow while keeping the data entirely on the GPU.
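As a tiny illustration of that interoperability (not taken from the notebook; the values are made up), a cuDF column can be handed to CuPy and back without ever leaving the GPU:

```python
import cudf
import cupy as cp

s = cudf.Series([3.0, 1.0, 2.0])
arr = cp.asarray(s)                 # view the column as a CuPy array on the GPU
ranks = cp.argsort(cp.argsort(arr)) # any CuPy operation works here
s_ranked = cudf.Series(ranks)       # wrap the CuPy result back into cuDF
```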
A real-world example
For this example, we will use the CHAMPS molecular properties prediction dataset. The task is to predict scalar coupling constants (the ground truth) between atom pairs in molecules for eight different chemical bond types. What's challenging is that the distribution of the ground truth differs significantly for each bond type, with varied mean and variance. This makes it difficult for the neural network to converge.
Hence, we applied the Gauss rank transformation to the ground truths of training data to create one unified clean Gaussian distribution for all bond types.
In this regression task, ground truths of training data are transformed using GaussRank.
For inference, we applied the inverse Gauss rank transformation to the predictions on the test data so that they match the original distribution of each bond type. Since the true distribution of the test targets is unknown, the inverse transformation is computed from the distribution of the target variable in the training data. Note that this inverse transformation is only needed for the target variables.
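One way to implement that inverse mapping (a simplified sketch, not the exact code from the notebook) is to interpolate the Gaussian-space predictions back onto the sorted training targets:

```python
import cupy as cp
from cupyx.scipy.special import erfinv

def inverse_gauss_rank(preds_gauss, train_targets, epsilon=1e-6):
    sorted_targets = cp.sort(train_targets)
    n = len(sorted_targets)
    # The Gaussian values that the sorted training targets map to under the
    # forward transform (same rank -> (-1, 1) -> erfinv recipe as before).
    scaled = cp.clip(2.0 * cp.arange(n) / (n - 1) - 1.0,
                     -1 + epsilon, 1 - epsilon)
    gauss_grid = erfinv(scaled)
    # Map each prediction back onto the original target scale.
    return cp.interp(preds_gauss, gauss_grid, sorted_targets)
```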
Keep in mind that GaussRank does have some limitations:
It works only for continuous variables, and
if the input is already close to Gaussian, or is highly asymmetrical, performance might not improve and may even get worse.
The interplay between gauss rank transformation and different kinds of neural networks is an active research topic.
Speedup
We measure the total time of the transformation and the inverse transformation. For the preceding CHAMPS dataset, the cuDF+CuPy implementation on a single NVIDIA V100 GPU achieves a 25x speedup over the Pandas+NumPy implementation on an Intel Xeon CPU. We also generate synthetic random data for a more comprehensive comparison. For 10M data points and more, our RAPIDS implementation is more than 100x faster.
Conclusion
RAPIDS has come a long way to deliver stunning performance with little or no code change. This blog post showcases how easy it is to use RAPIDS cuDF and CuPy as drop-in replacements for Pandas and NumPy to realize performance improvements on GPUs. As shown in the full notebook, by adding just two lines of code, the Gauss rank transformation detects that the input tensor is on the GPU and automatically switches from Pandas+NumPy to cuDF+CuPy. It can't get much easier than that.
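To give an idea of what that dispatch can look like (a hypothetical sketch, not the notebook's actual two lines), CuPy can select the right array backend based on where the input lives:

```python
import numpy as np
import cupy as cp

def rank_values(x):
    # cp.get_array_module returns the cupy module for GPU arrays and the
    # numpy module for host arrays, so one code path covers both backends.
    xp = cp.get_array_module(x)
    return xp.argsort(xp.argsort(x))

ranks_cpu = rank_values(np.random.rand(1000))  # runs with NumPy on the CPU
ranks_gpu = rank_values(cp.random.rand(1000))  # runs with CuPy on the GPU
```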
Up until a week ago, I had no problem using the Apple-provided TF version for the new M1 Macs. Two days ago the repository was archived and Apple published new instructions for using TF with the M1 Macs. The previous TF-for-Mac version stopped working at the same time, and unfortunately the new version, which I just installed, trains my simple MNIST model 15x slower. Does anyone know why this is, or whether this is a bug?
Hey all, I trained a “RoastBot” a while ago using a dataset I scraped from /r/RoastMe. The inputs are images of people and the outputs are high rated comments that are “roasts” of the people.
I use InceptionV3 to preprocess the images into latent vectors, and then I use a recurrent decoder with visual attention to generate the sequences. This works well enough to come up with something decent every now and again, but it seems like the model would do better if it started the training process already knowing about grammar and syntax.
I was thinking I could replace my decoder with a pre-trained BERT model, but BERT and other transformer models only take text as input, right? I think BERT at least preprocesses the text, though I'm not sure how.
My latent tensors are of shape (8, 8, 2048), and I imagine that the input text tensors for BERT are (num_tokens, 1). I guess I could flatten my tensor to shape (8*8*2048, 1), but I also don't know if BERT would even do a good job going from image data to text…
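Just to make the shapes concrete, here's a rough sketch of the reshapes I've been considering (the 768 is only there because it's BERT-base's hidden size; that number is my guess):

```python
import tensorflow as tf

latents = tf.zeros((1, 8, 8, 2048))            # what InceptionV3 gives me

flat = tf.reshape(latents, (1, 8 * 8 * 2048))  # the naive flatten: (1, 131072)
tokens = tf.reshape(latents, (1, 64, 2048))    # 64 "visual tokens" of dim 2048

# If a transformer expects, say, 768-dim embeddings (BERT-base's width),
# each visual token would still need a learned projection:
projection = tf.keras.layers.Dense(768)
projected = projection(tokens)                 # shape (1, 64, 768)
```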
If I could find a large model for image captioning that would be perfect for fine-tuning, but I don’t think it exists.
I have the following problem statement in which I only need to predict whether a given image is an apple or not. For training, only 8 images are provided, with the following details:
apple_1 image – 2400×1889 PNG
apple_2 image – 641×618 PNG
apple_3 image – 1000×1001 PNG
apple_4 image – 500×500 PNG contains a sticker on top of fruit
apple_5 image – 2400×1889 PNG
apple_6 image – 1000×1000 PNG
apple_7 image – 253×199 JPG
apple_8 image – 253×199 JPG
I am thinking about using transfer learning: either VGG or ResNet-18/34/50. Maybe ResNet is overkill for this problem statement? How do I deal with such varying image sizes and different file extensions (PNG, JPG)?
Any online code tutorial would be helpful. I found this example code online.
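To make the question concrete, this is roughly the setup I've been imagining (just a rough sketch on my side, separate from the example linked above; it assumes ResNet50, resizes everything to 224x224, and uses a made-up data/train folder layout):

```python
import tensorflow as tf

IMG_SIZE = (224, 224)  # resize every image, regardless of original resolution

base = tf.keras.applications.ResNet50(
    include_top=False, weights="imagenet",
    input_shape=(224, 224, 3), pooling="avg",
)
base.trainable = False  # only 8 images, so keep the pretrained weights frozen

model = tf.keras.Sequential([
    base,
    tf.keras.layers.Dense(1, activation="sigmoid"),  # apple vs. not-apple
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# PNG and JPG are both decoded by the loader, so the extension shouldn't matter.
train_ds = tf.keras.utils.image_dataset_from_directory(
    "data/train", image_size=IMG_SIZE, batch_size=4, label_mode="binary",
)
train_ds = train_ds.map(
    lambda x, y: (tf.keras.applications.resnet50.preprocess_input(x), y)
)
model.fit(train_ds, epochs=10)
```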