Categories
Misc

Hatch Me If You Can: Startup’s Sorting Machines Use AI to Protect Healthy Fish Eggs

Fisheries collect millions upon millions of fish eggs, protecting them from predators to increase fish yield and support the propagation of endangered species, but an issue with gathering so many eggs at once is that those infected with parasites can put healthy ones at risk. Jensorter, an Oregon-based startup, has created AI-powered fish egg …

The post Hatch Me If You Can: Startup’s Sorting Machines Use AI to Protect Healthy Fish Eggs appeared first on The Official NVIDIA Blog.

EarlyStopping: ‘patience’ count is reset when tuning in Keras

I’m using keras-tuner to perform a hyperparameter optimization of a neural network.

I’m using a Hyperband optimization, and I call the search method as:

import keras_tuner as kt

tuner = kt.Hyperband(
    ann_model,
    objective=kt.Objective("val_loss", direction="min"),
    max_epochs=100,
    factor=2,
    directory="/path/to/folder",  # the path must be passed as a string
    project_name="project_name",
    seed=0,
)

tuner.search(
    training_gen(),
    epochs=50,
    validation_data=valid_gen(),
    callbacks=[stop_early],
    steps_per_epoch=1000,
    validation_freq=1,
    validation_steps=100,
)

where the EarlyStopping callback is defined as:

import tensorflow as tf

stop_early = tf.keras.callbacks.EarlyStopping(
    monitor="val_loss", min_delta=0.1, mode="min", patience=15
)

Hyperband initially trains many models (each with a different combination of the previously chosen hyperparameters) for only 2 epochs; it then discards the poorly performing models and keeps training only the most promising ones, step by step, with an increasing number of epochs at each step (the final goal is to discard all models except the best-performing one).

So the training of a specific model is not performed in one shot but in steps, and at each step Keras saves the state of the training.

With max_epochs=100, I noticed that the training of a model is performed in these steps (called “Running trials”):

  1. firstly, from epoch 1 to epoch 3;
  2. secondly, from 4 to 7;
  3. then, from 8 to 13;
  4. then, from 14 to 25;
  5. then, from 26 to 50;
  6. and finally, from 51 to 100.

So, at the end of each “Running trial”, Keras saves the state of the training, in order to continue, at the next “Running trial”, the training from that state.

With patience=15, EarlyStopping cannot operate during “Running trials” 1), 2), 3), and 4) of the list above, because the number of training epochs in each of them is less than patience; it can operate only during “Running trials” 5) and 6).

Initially I thought that the patience count started at epoch 1 and would never reset when a new “Running trial” begins, but I noticed that the EarlyStopping callback stops the training at epoch 41, that is, during “Running trial” 5), which goes from epoch 26 to epoch 50.
So it seems to me that the patience count is reset at the beginning of each “Running trial”: EarlyStopping stops the training at epoch 41, the first epoch at which it is able to operate, because start_epoch + patience = 26 + 15 = 41.

Is it normal/expected behavior that patience is automatically reset at the beginning of each “Running trial” while using Keras Hyperband tuning?
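
For what it's worth, this seems consistent with how the callback is written: as far as I can tell, Keras' EarlyStopping resets its wait counter and its best value in on_train_begin, and each “Running trial” resumes training through a new fit call, so the count starts over. The sketch below is a plain-Python toy, not the Keras class itself; PatienceCounter, first_stop_epoch, and the perfectly flat validation loss are assumptions made only for illustration. It replays the six steps listed above with and without a reset.

```python
import math


class PatienceCounter:
    """Toy patience bookkeeping (an illustration, not keras.callbacks.EarlyStopping)."""

    def __init__(self, patience, min_delta):
        self.patience = patience
        self.min_delta = min_delta
        self.on_train_begin()

    def on_train_begin(self):
        # Forget everything seen so far, as happens at the start of a fit call.
        self.wait = 0
        self.best = math.inf

    def on_epoch_end(self, val_loss):
        """Return True when training should stop after this epoch."""
        if val_loss < self.best - self.min_delta:
            self.best = val_loss
            self.wait = 0
            return False
        self.wait += 1
        return self.wait >= self.patience


def first_stop_epoch(steps, val_loss_at, reset_each_step, patience=15, min_delta=0.1):
    """Replay the steps and return the first epoch at which the counter stops."""
    counter = PatienceCounter(patience, min_delta)
    for start, end in steps:
        if reset_each_step:
            counter.on_train_begin()
        for epoch in range(start, end + 1):
            if counter.on_epoch_end(val_loss_at(epoch)):
                return epoch
    return None


# Epoch ranges of the six steps from the list above.
STEPS = [(1, 3), (4, 7), (8, 13), (14, 25), (26, 50), (51, 100)]


def flat_loss(epoch):
    # Assumed: val_loss never improves by more than min_delta.
    return 1.0


stop_with_reset = first_stop_epoch(STEPS, flat_loss, reset_each_step=True)
stop_without_reset = first_stop_epoch(STEPS, flat_loss, reset_each_step=False)
print(stop_with_reset, stop_without_reset)
```

With a reset at every step the toy stops at epoch 41, as observed; with a single counter running from epoch 1, the same flat loss would already have stopped at epoch 16.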

submitted by /u/RainbowRedditForum

UK Biobank Advances Genomics Research with NVIDIA Clara Parabricks

UK Biobank is broadening scientists’ access to high-quality genomic data and analysis by making its massive dataset available in the cloud alongside NVIDIA GPU-accelerated analysis tools. Used by more than 25,000 registered researchers around the world, UK Biobank is a large-scale biomedical database and research resource with deidentified genetic datasets, along with medical imaging and …

The post UK Biobank Advances Genomics Research with NVIDIA Clara Parabricks appeared first on The Official NVIDIA Blog.

A beginner question

tf.flags.DEFINE_string('config', '', 'Path to the file with configurations') 

What does this mean? What will be a better document to learn the basics of TF?
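
For context: the line defines a command-line flag named config, of type string, with an empty string as the default and 'Path to the file with configurations' as the help text, so the script can be started with something like --config=/some/path and read the value back later. tf.flags is the TensorFlow 1.x spelling; in 2.x it lives under tf.compat.v1.flags, and the same API comes from the absl.flags package. The sketch below shows the same idea with Python's standard argparse module, used here only as a stand-in; parse_config_flag and the example path are made up for illustration.

```python
import argparse


def parse_config_flag(argv):
    """Parse a --config string flag with an empty default (argparse stand-in)."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--config",
        type=str,
        default="",
        help="Path to the file with configurations",
    )
    return parser.parse_args(argv)


# Equivalent to running: python train.py --config=configs/example.yaml
args = parse_config_flag(["--config=configs/example.yaml"])
print(args.config)
```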

submitted by /u/Admirable-Study-626

Text prediction project – Finally managed to break the plateau by increasing the probability of keeping weights. But I’m sure it will plateau again at some point; maybe the only thing left to change at this point is the learning rate?

submitted by /u/Smsm1998

What does model.evaluate return?

This may seem like a basic question but I am a little confused. In this example picture I call model.evaluate. The returned message says the batch_loss is 2.1455. But I had a callback keep track of the batch loss and if I calculate the mean of the batch_loss it is 2.0925, and 2.1455 is not seen in any of the callback loss values. So… What is evaluating returning here?

https://preview.redd.it/k2l2bt2pdzd81.png?width=536&format=png&auto=webp&s=7ec27ea483283e721c7053cdbf41ef0a22d3ff64

This is a custom model, so it could be my implementation, but it’s weird that the value is so close yet off.

Thanks!
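
One possible explanation, offered as a guess since the custom model's code isn't shown: if the last batch is smaller than the others, the plain mean of the per-batch losses differs from the sample-weighted mean, and, as far as I know, Keras' built-in loss tracking weights each batch by its size. The sketch below shows only the arithmetic in plain Python; the loss values and batch sizes are invented for illustration and are not taken from the screenshot.

```python
def mean_of_batch_means(batch_losses):
    """Plain average of per-batch losses, ignoring how many samples each batch had."""
    return sum(batch_losses) / len(batch_losses)


def sample_weighted_mean(batch_losses, batch_sizes):
    """Average of per-batch losses weighted by the number of samples in each batch."""
    total = sum(loss * size for loss, size in zip(batch_losses, batch_sizes))
    return total / sum(batch_sizes)


# Hypothetical evaluation run: two full batches and a smaller final batch.
batch_losses = [2.0, 2.0, 3.0]
batch_sizes = [4, 4, 2]

unweighted = mean_of_batch_means(batch_losses)
weighted = sample_weighted_mean(batch_losses, batch_sizes)
print(unweighted, weighted)
```

The two numbers are close but not equal, and the weighted value does not appear among the individual batch losses, which is the pattern described in the question.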

submitted by /u/Dylan_TMB

Values of gradients to be returned

If I am making a function that has two inputs, what values is `tape.gradient` expecting?

For example, if I input data in a batch size to the function `f(x,y)`, the `tape.gradient` expects both of the returned gradients to be of shape (batch size, 2), however I’m not sure what should be the contents of those Tensors. For a batch size of 3, I thought it would look something like the following:

df/dx (x_1, y_1) df/dy (x_1, y_1)
df/dx (x_2, y_2) df/dy (x_2, y_2)
df/dx (x_3, y_3) df/dy (x_3, y_3)

Where the above table is a Tensor, but I have no idea what other tensor it could want for the second variable. Does anyone have an idea of what it should be? Thanks
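
One way to think about it, assuming `x` and `y` are each a batch of scalars with shape (batch size,): `tape.gradient(target, [x, y])` returns one tensor per source, each with the same shape as that source, so the table above would be split column-wise into two tensors rather than repeated for each variable. The plain-Python sketch below illustrates that convention without TensorFlow; the function f(x, y) = x² · y and the helper name are assumptions chosen only for the example.

```python
def per_input_gradients(xs, ys):
    """Analytic gradients of f(x, y) = x**2 * y for a batch of scalar pairs.

    Returns two lists, one per input, each shaped like that input.
    """
    df_dx = [2.0 * x * y for x, y in zip(xs, ys)]  # gradient with respect to x
    df_dy = [x ** 2 for x in xs]                   # gradient with respect to y
    return df_dx, df_dy


xs = [1.0, 2.0, 3.0]
ys = [4.0, 5.0, 6.0]
grad_x, grad_y = per_input_gradients(xs, ys)
print(grad_x)  # [8.0, 20.0, 36.0]
print(grad_y)  # [1.0, 4.0, 9.0]
```

If the inputs really have shape (batch size, 2), the same rule would apply: each returned gradient has the shape of its own input.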

submitted by /u/soravoid

NVIDIA Nsight Systems 2022.1 Introduces Vulkan 1.3 and Linux Backtrace Sampling and Profiling Improvements

The latest Nsight Systems 2022.1 release introduces several improvements aimed at enhancing the profiling experience.

The latest update to NVIDIA Nsight Systems, a performance analysis tool designed to help developers tune and scale software across CPUs and GPUs, is now available for download. Nsight Systems 2022.1 introduces several improvements aimed at enhancing the profiling experience.

Nsight Systems is part of the powerful debugging and profiling NVIDIA Nsight Tools Suite. A developer can start with Nsight Systems for an overall system view and avoid picking less efficient optimizations based on assumptions and false-positive indicators. 

2022.1 highlights 

  • Vulkan 1.3 support.
  • System-wide CPU backtrace sampling and CPU context switch tracing on Linux.
  • NVIDIA NIC metrics sampling improvements.
  • MPI trace improvements.
  • Improvements for remote profiling over SSH. 

With Vulkan 1.3, you now have access to nearly two dozen new extensions. Some extensions, like VK_KHR_dynamic_rendering, help you to simplify your code while improving readability.

Other extensions, such as VK_KHR_shader_integer_dot_product or VK_EXT_pipeline_creation_cache_control, provide new functionality to help you build even better graphics applications. 

This release of Nsight Systems includes support for Vulkan 1.3 that helps you solve real world problems quickly and easily.

Figure 1. Vulkan 1.3 and ray tracing leadership video

With the system-wide CPU thread context switch trace and backtrace sampling feature on Linux, you can now see whether other apps, OS processes, or the kernel might be interfering with the processes you are profiling.

Workflow of backtrace sampling and CPU context switch tracing on Linux
Figure 2. System-wide CPU backtrace sampling and CPU context switch tracing on Linux.

Overcoming Data Collection and Augmentation Roadblocks with NVIDIA TAO Toolkit and Appen Data Annotation Platform

Generating and labeling data to train AI models is time-consuming. Appen helps label and annotate your data, which can then be used as input to the TAO Toolkit.

Building AI models from scratch requires enormous amounts of data, time, money, and expertise. This is at odds with what it takes to succeed in the AI space: fast time-to-market and the ability to quickly evolve and customize solutions. NVIDIA TAO, an AI-Model-Adaptation framework, enables you to leverage production-quality, pretrained AI models and fine-tune them in a fraction of the time compared to training from scratch.

To fine-tune these models further, or confirm the precision of your model, additional high-quality training data is required. Appen, a data annotation partner for TAO, provides access to high-quality datasets and services to label and annotate your data for your unique needs, if you don’t have the right data available.

In this post, I show you how to use the NVIDIA TAO Toolkit, a CLI-based solution of the NVIDIA TAO framework, along with Appen’s data labeling platform to simplify the overall training process and create highly customized models for a particular use case.

After your team has identified a business problem to solve using ML, you can select from NVIDIA’s collection of pretrained AI models in computer vision and conversational AI. Computer vision models can include face detection models, text recognition, segmentation, and more. Then you can apply the TAO Toolkit to build, train, test, and deploy your solution.

To speed up the data collection and augmentation process, you can now use the Appen Data Annotation Platform to create the right training data for your use case. The robust platform enables you to access Appen’s global crowd of over one million skilled annotators from over 170 countries, speaking 235 languages. Appen’s data annotation platform and expertise also provide you with other resources:

  • High-quality datasets (for when you need data)
  • Human labelers sourced globally to annotate your unlabeled data
  • An easy-to-use platform where you can launch annotation jobs and monitor key metrics
  • Quality assurance checks and data security controls

With clean, high-quality data, you can adapt pretrained NVIDIA models to suit your requirements, pruning and retraining them to achieve the performance level that you need.

How to prepare your data using Appen’s platform

If you don’t already have data to use for training your model, you can either collect that data yourself or turn to Appen to source datasets that suit your use cases. The Appen Data Annotation Platform (ADAP) works with a diverse set of formats:

  • Audio (.wav, .mp3)
  • Image (.jpeg, .png)
  • Text (.txt)
  • Video (URL)

When the data collection phase is complete, whether you collected the data yourself or worked with Appen to source it, you can use Appen’s platform to quickly label the data. You need an Appen platform license and a budget for each row of data annotation.

From there, work through the following steps to deploy a model that works specifically for your needs. For the purposes of this post, assume that you’re annotating images for an object detection model.

Prepare your data

First, load your image data into a web-accessible location: the cloud or a location that ADAP can access, such as a private Amazon S3 bucket.

Next, structure your input CSV file with two columns. The first column contains the filenames, and the second includes URLs to the images. You can provide the URLs in one of three ways:

  • Use publicly available URLs for your data.
  • Use presigned URLs.
  • Use Appen’s Secure Data Access tool, which you can use to attach your database securely to the platform; Appen only accesses your data when needed.

The filename column contains the local file name on your device. Figure 1 shows what your CSV file may look like.

Three columns of data (labels, filename, and image_url), five rows labeled 0 to 4 with names and URLs.
Figure 1. CSV structure for data upload in ADAP
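
As a rough illustration of that layout, the following sketch builds such a file with Python's standard csv module. The column names filename and image_url follow the figure description; the helper name build_upload_csv, the file names, and the example.com URLs are placeholders, not values from ADAP.

```python
import csv
import io


def build_upload_csv(rows):
    """Serialize rows into CSV text with a filename column and an image_url column."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=["filename", "image_url"], lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Placeholder rows; replace the URLs with public, presigned, or Secure Data Access links.
rows = [
    {"filename": "img_0001.jpg", "image_url": "https://example.com/images/img_0001.jpg"},
    {"filename": "img_0002.jpg", "image_url": "https://example.com/images/img_0002.jpg"},
]

csv_text = build_upload_csv(rows)
print(csv_text)
```

To produce a file for upload, write csv_text to disk, for example with open("upload.csv", "w").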

Create your job and upload data

If you haven’t already, you can create an ADAP account and sign in. You must have an active license for the platform before running new jobs. To learn more about plans and pricing, contact Appen.

After logging in, under Jobs, choose Create a Job.

A window with four job rows, job titles, ids, rows, cost, date created. In the header, there is a search box and three filters, as well as the Create Job button.
Figure 2. ADAP jobs overview page

Select the template that best fits the job (sentiment analysis, search relevance, and so on). For this example, choose Image Annotation.

Prebuilt annotation job templates sorted by use case, with filters and search. Current view is filtered on Image annotation templates. There is a start from scratch button available.
Figure 3. ADAP job templates page – Image annotation

Under Image Annotation, choose Annotate and Categorize Objects in an Image Using a Bounding Box. Upload your CSV file by dragging and dropping it into the Upload tab.

Design your job

Provide guidelines for Appen’s crowd of over one million data labelers on what they should be looking for and any requirements that they should be aware of. The template provides a simple job design to help you get started.

Next, choose Manage the Image Annotation Ontology, where you’ll define the classes that should be detected. Update the instructions to give more context about your use case and describe how annotators should identify and label objects in images. You can preview your job and see how an annotator will view it.

Finally, create test questions to measure and track labeler performance.

Launch job

Do a test run before officially launching your annotation job on the platform. After you’ve launched your job, Appen’s global crowd of data labelers annotates your data to your specifications.

Monitor

Monitor the accuracy rate of annotations in real time. Adjust as needed in areas like job design, test questions, or annotators.

A page that analyzes annotation job progress by displaying charts about completion rate, rows finalized, pending judgements, total costs and rows uploaded.
Figure 8. ADAP annotation progress monitoring page

Results

Download a report of your labeled data output by choosing Download, Full.

Convert output to KITTI format

From here, you need a script to convert your labeled data to a format that the TAO Toolkit can ingest, such as the KITTI format.

Using the output from the previous step, you can convert your labeled data to a format like the Pascal Visual Object Classes (VOC) format. For the full code and a guide on how to convert your data, see the /Appen/public-demos GitHub repo.
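
As a minimal sketch of the KITTI side of such a conversion, the following code formats one 2D bounding box as a 15-field KITTI label line, with the truncation, occlusion, and 3D fields zeroed because they are not used for 2D detection. The input dictionary layout (label, x, y, width, height, measured in pixels from the top-left corner) is a hypothetical stand-in for the ADAP report, whose actual schema is not shown here; the conversion code in the Appen repo may differ.

```python
def to_kitti_line(class_name, xmin, ymin, xmax, ymax):
    """Format one 2D box as a KITTI label line (15 space-separated fields)."""
    return (
        f"{class_name} 0.00 0 0.00 "                      # type, truncated, occluded, alpha
        f"{xmin:.2f} {ymin:.2f} {xmax:.2f} {ymax:.2f} "   # 2D box: left, top, right, bottom
        "0.00 0.00 0.00 0.00 0.00 0.00 0.00"              # 3D dimensions, location, rotation_y
    )


def convert_annotation(box):
    """Convert a hypothetical {label, x, y, width, height} box to a KITTI line."""
    xmin, ymin = box["x"], box["y"]
    xmax, ymax = xmin + box["width"], ymin + box["height"]
    return to_kitti_line(box["label"], xmin, ymin, xmax, ymax)


example_box = {"label": "car", "x": 10, "y": 20, "width": 100, "height": 50}
kitti_line = convert_annotation(example_box)
print(kitti_line)
# car 0.00 0 0.00 10.00 20.00 110.00 70.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
```

KITTI label files hold one such line per object and one text file per image, and class names must not contain spaces.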

Training your model

Your data annotated with Appen can now be used to train your object detection model. The TAO Toolkit allows you to train, fine-tune, prune, and export highly optimized and accurate AI models for deployment by adapting popular network architectures and backbones to your data. For this example, choose a YOLOv3 object detection model.

First, download the notebook.

$ wget --content-disposition https://api.ngc.nvidia.com/v2/resources/nvidia/tlt_cv_samples/versions/v1.0.2/zip -O tlt_cv_samples_v1.0.2.zip

$ unzip -u tlt_cv_samples_v1.0.2.zip  -d ./tlt_cv_samples_v1.0.2 && rm -rf tlt_cv_samples_v1.0.2.zip && cd ./tlt_cv_samples_v1.0.2

After the notebook samples are downloaded, you can start the notebook server using the following command:

$ jupyter notebook --ip 0.0.0.0 --port 8888 --allow-root

Open a web browser on the local machine and go to the following URL:

http://0.0.0.0:8888

Because you are creating a YOLOv3 model, open the yolo_v3/yolo_v3.ipynb notebook. Follow the notebook instructions to train the model.

Based on results, fine-tune the model until it achieves your metric goals. If desired, you can create your own active learning loop at this stage. Prioritize data based on confidence or another selection metric using the CSV file method as described in the preceding steps. You can also load data (including inputs and predictions) beforehand so Appen’s annotators can validate the model after it’s been trained, reviewing the predictions using our domain experts and open crowd.
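
One simple way to prioritize data by confidence is to send the least-confident predictions back for annotation first. The sketch below assumes each prediction is a dictionary with filename and confidence keys; that layout, the helper name select_for_annotation, and the sample values are illustrative assumptions, not part of the TAO Toolkit or ADAP.

```python
def select_for_annotation(predictions, budget):
    """Return the `budget` predictions with the lowest confidence, lowest first."""
    ranked = sorted(predictions, key=lambda p: p["confidence"])
    return ranked[:budget]


# Hypothetical model outputs for four unlabeled images.
predictions = [
    {"filename": "a.jpg", "confidence": 0.95},
    {"filename": "b.jpg", "confidence": 0.40},
    {"filename": "c.jpg", "confidence": 0.72},
    {"filename": "d.jpg", "confidence": 0.15},
]

to_label = select_for_annotation(predictions, budget=2)
print([p["filename"] for p in to_label])  # ['d.jpg', 'b.jpg']
```

The selected file names can then be written to a CSV file in the same way as the original upload and submitted as a new annotation job.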

Pro tip: Use Workflows, an Appen Solution, to build and automate multistep data annotation projects with ease.

Iterate

Appen can further assist you with data collection and annotation in subsequent rounds of model training as you iteratively improve on your model performance. To avoid model drift or to accommodate changing business requirements, retrain your model regularly.

Conclusion

NVIDIA TAO Toolkit combined with Appen’s data platform enables you to train, fine-tune, and optimize pretrained models to get your AI solutions off the ground faster. Speed up your development tenfold without sacrificing quality. With the help of integrated expertise and tools from NVIDIA and Appen, you’ll be ready to launch AI with confidence.

Does the MeanSquaredError of a batch return its average MSE?

I’m looking at the MSE source documentation for help making a custom loss function, and it seems that MSE will return a single loss value no matter what, not a loss value per pair of y_true and y_pred. Could anyone help me understand this?
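
From what I understand, the difference comes from the reduction step. In TF 2.x the functional form tf.keras.losses.mean_squared_error averages over the last axis only and returns one value per sample, while the MeanSquaredError class then applies a reduction that by default averages those values over the batch into a single scalar; constructing it with reduction=tf.keras.losses.Reduction.NONE should keep the per-sample values. The plain-Python sketch below mirrors those two steps without TensorFlow; the function names and the numbers are made up for illustration.

```python
def per_sample_mse(y_true, y_pred):
    """Mean squared error over the last axis: one loss value per sample."""
    return [
        sum((t - p) ** 2 for t, p in zip(true_row, pred_row)) / len(true_row)
        for true_row, pred_row in zip(y_true, y_pred)
    ]


def batch_mse(y_true, y_pred):
    """Average the per-sample losses into one scalar for the whole batch."""
    losses = per_sample_mse(y_true, y_pred)
    return sum(losses) / len(losses)


y_true = [[0.0, 1.0], [2.0, 4.0]]
y_pred = [[1.0, 1.0], [2.0, 2.0]]

print(per_sample_mse(y_true, y_pred))  # [0.5, 2.0]
print(batch_mse(y_true, y_pred))       # 1.25
```

A custom loss written as a plain function can return the per-sample values and leave the averaging to Keras.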

submitted by /u/soravoid