Categories
Misc

Looking for help installing the TensorFlow Object Detection API

I’ve been struggling to get the TensorFlow Object Detection API installed on Windows. I’m willing to pay anyone who can help me install it successfully, assuming I don’t figure it out on my own.

submitted by /u/Ok_Wish4469
[visit reddit] [comments]

Categories
Misc

TensorFlow and Keras 2.9

New TensorFlow and Keras releases bring improvements big and small.

Categories
Offsites

LIMoE: Learning Multiple Modalities with One Sparse Mixture of Experts Model

Sparse models stand out among the most promising approaches for the future of deep learning. Instead of every part of a model processing every input (“dense” modeling), sparse models employing conditional computation learn to route individual inputs to different “experts” in a potentially huge network. This has many benefits. First, model size can increase while keeping computational cost constant — an effective and environmentally friendlier way to scale models, which is often key to high performance. Sparsity also naturally compartmentalizes neural networks. Dense models that learn many different tasks simultaneously (multitask) or sequentially (continual learning) often suffer negative interference, where too much task variety means it is better to just train one model per task, or catastrophic forgetting, where the model becomes worse at earlier tasks as new ones are added. Sparse models help avoid both these phenomena — by not applying the whole model to all inputs, “experts” in the model can specialize on different tasks or data types while still taking advantage of shared parts of the model.

Research on sparsity has long been pursued at Google Research. Pathways summarizes the research vision of building one single large model that diligently handles thousands of tasks and numerous data modalities. So far there has been considerable progress in sparse unimodal models for language (Switch, Task-MoE, GLaM) and computer vision (Vision MoE). Today, we take another important step towards the Pathways vision by studying large sparse models that simultaneously handle images and text with modality-agnostic routing. A relevant approach is multimodal contrastive learning, which requires a solid understanding of both images and text in order to align pictures with their correct text description. The strongest models that tackle this task to date rely on independent networks for each modality (a “two-tower” approach).

In “Multimodal Contrastive Learning with LIMoE: the Language Image Mixture of Experts”, we present the first large-scale multimodal architecture using a sparse mixture of experts. It simultaneously processes both images and text, but uses sparsely activated experts that naturally specialize. On zero-shot image classification, LIMoE outperforms both comparable dense multimodal models and two-tower approaches. The largest LIMoE achieves 84.1% zero-shot ImageNet accuracy, comparable to more expensive state-of-the-art models. Sparsity enables LIMoE to scale up gracefully and learn to handle very different inputs, addressing the tension between being a jack-of-all-trades generalist and a master-of-one specialist.

The LIMoE architecture contains many “experts” and routers decide which tokens (parts of an image or sentence) go to which experts. After being processed by expert layers (gray) and shared dense layers (brown), a final output layer computes a single vector representation for either an image or a text.

Sparse Mixture of Expert Models
Transformers represent data as a sequence of vectors (or tokens). Though originally developed for text, they can be applied to most things that are representable as a sequence of tokens, e.g., images, videos, and audio. Recent large-scale MoE models add expert layers to the Transformer architecture (e.g., gShard and ST-MoE in natural language processing, and Vision MoE for vision tasks).

A standard Transformer consists of many “blocks”, each containing various different layers. One of these layers is a feed-forward network (FFN). For LIMoE and the works cited above, this single FFN is replaced by an expert layer that contains many parallel FFNs, each of which is an expert. Given a sequence of tokens to process, a simple router learns to predict which experts should handle which tokens. Only a small number of experts are activated per token, meaning although the model capacity is significantly increased by virtue of having so many experts, the actual computational cost is controlled by using them sparsely. If only one expert is activated, the model’s cost is roughly equivalent to the standard Transformer model.
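
To make the routing mechanics concrete, below is a minimal NumPy sketch of a single expert layer with top-1 routing. It is purely illustrative rather than the LIMoE implementation; the expert count, dimensions, and softmax router are assumptions for the example.

import numpy as np

rng = np.random.default_rng(0)
num_tokens, d_model, d_hidden, num_experts = 16, 64, 256, 4

# Token representations entering the expert layer
tokens = rng.normal(size=(num_tokens, d_model))

# Each "expert" is an independent feed-forward network (biases omitted for brevity)
experts = [
    (rng.normal(size=(d_model, d_hidden)) * 0.02,
     rng.normal(size=(d_hidden, d_model)) * 0.02)
    for _ in range(num_experts)
]

# The router is a single linear layer producing one logit per expert
router_w = rng.normal(size=(d_model, num_experts)) * 0.02
logits = tokens @ router_w
gate = np.exp(logits - logits.max(-1, keepdims=True))
gate /= gate.sum(-1, keepdims=True)        # softmax over experts
chosen = logits.argmax(axis=-1)            # top-1 routing: each token picks one expert

# Dispatch: each token is processed only by its chosen expert
output = np.zeros_like(tokens)
for e, (w_in, w_out) in enumerate(experts):
    mask = chosen == e
    if mask.any():
        hidden = np.maximum(tokens[mask] @ w_in, 0.0)             # ReLU FFN
        output[mask] = (hidden @ w_out) * gate[mask, e][:, None]  # weight by gate

print(output.shape)  # (16, 64): same shape as the input, but each token
                     # only touched one expert's parameters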

LIMoE does precisely that, activating one expert per example, thereby matching the computational cost of the dense baselines. What’s different is that the LIMoE router might see tokens of either image or text data.

A unique failure mode of MoE models occurs when they try to send all tokens to the same expert. Typically this is addressed with auxiliary losses, extra training objectives that encourage balanced expert usage. We found that dealing with multiple modalities interacted with sparsity to cause new failure modes that existing auxiliary losses could not address. To overcome this, we developed new auxiliary losses (more details in the paper) and used routing prioritization (BPR) during training, two innovations that resulted in stable and high performance multimodal models.
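
LIMoE’s new auxiliary losses are detailed in the paper; for orientation only, here is a sketch of the classic load-balancing auxiliary loss commonly used in sparse MoE models, not the LIMoE-specific losses.

import numpy as np

def load_balancing_aux_loss(router_probs, chosen_expert, num_experts):
    """Classic load-balancing auxiliary loss used in many sparse MoE models."""
    # router_probs: (num_tokens, num_experts) softmax outputs of the router
    # chosen_expert: (num_tokens,) top-1 expert index for each token
    tokens_per_expert = np.bincount(chosen_expert, minlength=num_experts)
    fraction = tokens_per_expert / len(chosen_expert)   # f_e: share of tokens sent to expert e
    mean_prob = router_probs.mean(axis=0)               # P_e: mean router probability for expert e
    # Minimized when both distributions are uniform, i.e. experts are used evenly
    return num_experts * np.sum(fraction * mean_prob)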

The new auxiliary losses (LIMoE aux) and routing prioritization (BPR) stabilized and improved overall performance (left) and increased the success rate of routing behavior (middle and right). A low success rate means the router does not use all the experts available and drops many tokens due to individual expert capacity being reached, which usually indicates the sparse model is not learning well. The combination introduced for LIMoE ensures high routing success rates for both images and text and consequently leads to significantly better performance.

Contrastive Learning with LIMoE
In multimodal contrastive learning, models are trained on paired image-text data (e.g., a photo and its caption). Typically, an image model extracts a representation of images, and a different text model extracts a representation of text. The contrastive learning objective encourages the image and text representations to be close for the same image-text pair and far away for content from different pairs. Such models with aligned representations can be adapted to new tasks without extra training data (“zero-shot”), e.g., an image will be classified as a dog if its representation is closer to the representation of the word “dog” than the word “cat”. This idea scales to thousands of classes and is referred to as zero-shot image classification.
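
As a toy illustration of how zero-shot classification works with aligned embeddings (the encoders below are random stand-ins, not real image or text models):

import numpy as np

def fake_encoder(name, dim=8):
    # Stand-in for a trained image or text encoder: returns a unit-norm vector
    rng = np.random.default_rng(abs(hash(name)) % (2**32))
    v = rng.normal(size=dim)
    return v / np.linalg.norm(v)

class_names = ["dog", "cat", "car"]
text_embeddings = np.stack([fake_encoder(f"a photo of a {c}") for c in class_names])

image_embedding = fake_encoder("some_image.jpg")  # would come from the image encoder

# Cosine similarity between the image and each class description;
# the highest-scoring class name is the zero-shot prediction
scores = text_embeddings @ image_embedding
print(class_names[int(scores.argmax())])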

CLIP and ALIGN (both two-tower models) scaled this process to achieve 76.2% and 76.4% zero-shot classification accuracy on the popular ImageNet dataset. We study one-tower models which compute both image and text representations. We find this reduces performance for dense models, likely due to negative interference or insufficient capacity. However, a compute-matched LIMoE not only improves over the one-tower dense model, but also outperforms two-tower dense models. We trained a series of models in a comparable training regimen to CLIP. Our dense L/16 model achieves 73.5% zero-shot accuracy, whereas LIMoE-L/16 gets to 78.6%, even outperforming CLIP’s more expensive, two-tower L/14 model (76.2%). As shown below, LIMoE’s use of sparsity provides a remarkable performance boost over dense models with equivalent cost.

For a given compute cost (x-axis), LIMoE models (circles, solid line) are significantly better than their dense baselines (triangles, dashed line). The architecture indicates the size of the underlying transformer, increasing from left (S/32) to right (L/16). Following standard convention, S (small), B (base), and L (large) refer to model scale. The number refers to the patch size, where smaller patches imply a larger architecture.

LiT and BASIC pushed zero-shot accuracy for dense two-tower models to 84.5% and 85.6%, respectively. In addition to scaling, these approaches made use of specialized pre-training methods, repurposing image models that were already of exceptionally high quality. LIMoE-H/14 does not benefit from any pre-training or modality-specific components, but still achieved a comparable 84.1% zero-shot accuracy training from scratch. The scale of these models is also interesting to compare: LiT and BASIC are 2.1B and 3B parameter models. LIMoE-H/14 has 5.6B parameters in total, but via sparsity it only applies 675M parameters per token, making it significantly more lightweight.

Data seen during training (pre-training, image-text, and total), parameters per token, and zero-shot ImageNet accuracy:

Model      | Pre-training | Image-text | Total | Parameters per token | ImageNet accuracy
CLIP       | –            | 12.8B      | 12.8B | ~200M                | 76.2%
ALIGN      | –            | 19.8B      | 19.8B | ~410M                | 76.4%
LiT        | 25.8B        | 18.2B      | 44.0B | 1.1B                 | 84.5%
BASIC      | 19.7B        | 32.8B      | 52.5B | 1.5B                 | 85.6%
LIMoE-H/14 | –            | 23.3B      | 23.3B | 675M                 | 84.1%

Understanding LIMoE’s Behavior
LIMoE was motivated by the intuition that sparse conditional computation enables a generalist multimodal model to still develop the specialization needed to excel at understanding each modality. We analyzed LIMoE’s expert layers and uncovered a few interesting phenomena.

First, we see the emergence of modality-specialized experts. In our training setup there are many more image tokens than text tokens, so all experts tend to process at least some images, but some experts process mostly images, some mostly text, and some a mix of both.

Distributions for an eight-expert LIMoE; percentages indicate the proportion of image tokens each expert processes. There are one or two experts clearly specialized on text (shown by the mostly blue experts), usually two to four image specialists (mostly red), and the remainder are somewhere in the middle.

There are also some clear qualitative patterns among the image experts — e.g., in most LIMoE models, there is an expert that processes all image patches that contain text. In the example below, one expert processes fauna and greenery, and another processes human hands.

LIMoE chooses an expert for each token. Here we show which image tokens go to which experts on one of the layers of LIMoE-H/14. Despite not being trained to do so, we observe the emergence of semantic experts that specialize in specific topics such as plants or wheels.

Moving Forward
Multimodal models that handle many tasks are a promising route forward, and there are two key ingredients for success: scale, and the ability to avoid interference between distinct tasks and modalities while taking advantage of synergies. Sparse conditional computation is an excellent way of doing both. It enables performant and efficient generalist models that also have the capacity and flexibility for the specialization necessary to excel at individual tasks, as demonstrated by LIMoE’s solid performance with less compute.

Acknowledgements
We thank our co-authors on this work: Joan Puigcerver, Rodolphe Jenatton and Neil Houlsby. We also thank Andreas Steiner, Xiao Wang and Xiaohua Zhai, who led early explorations into dense single-tower models for contrastive multimodal learning, and also were instrumental in providing data access. We enjoyed useful discussions with André Susano Pinto, Maxim Neumann, Barret Zoph, Liam Fedus, Wei Han and Josip Djolonga. Finally, we would also like to thank and acknowledge Tom Small for the awesome animated figure used in this post.

Categories
Misc

Can anyone let me know how I can split a saved_model into several tflite files

Hello,

I am trying to split my saved_model.pb into several tflite files. Any suggestions on how to do this?

Thanks, Susan

submitted by /u/Playful_Buddy9042
[visit reddit] [comments]

Categories
Misc

Out of This World: ‘Mass Effect Legendary Edition’ and ‘It Takes Two’ Lead GFN Thursday Updates

Some may call this GFN Thursday legendary, as Mass Effect Legendary Edition and It Takes Two join the GeForce NOW library. Both games expand the number of Electronic Arts games streaming from our GeForce cloud servers and are part of 10 new additions this week. Adventure Awaits in the Cloud: Relive the saga of…

The post Out of This World: ‘Mass Effect Legendary Edition’ and ‘It Takes Two’ Lead GFN Thursday Updates appeared first on NVIDIA Blog.

Categories
Offsites

Summer of Math Exposition #2

Categories
Misc

Can you define a custom loss function that is different for each observation?

Probably a stupid question, but I am a complete noob with tensorflow (and haven’t used python in years) and I’m trying to see if I can use it to solve a very specific problem that I have. Basically, I am running a multivariate regression and the loss function is supposed to vary depending on the observation.

For example, assume that for each row i of the data frame the model outputs a vector y_hat with 5 values, and the true vector for that observation is y. The “default” loss function would be the L1 norm of the error:

sum([abs(y[k] - y_hat[k]) for k in range(0,5)]) 

But I would like to add a penalty depending on i, like this:

sum([abs(y[k] - y_hat[k]) for k in range(0,5)]) + abs(y[i%5] - y_hat[i%5]) 

Is this even possible? I was thinking that maybe I could somehow modify the output layer such that it outputs a vector with 6 values, making sure that the first value is always equal to i and the remaining 5 values are equal to y_hat, but I don’t know how to do that either.
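
A rough, untested sketch of one way this could work (all names below are made up): instead of changing the output layer, pack the row index into the target tensor and slice it off inside a custom loss.

import numpy as np
import tensorflow as tf

def indexed_l1_loss(y_true_with_index, y_pred):
    # y_true_with_index has 6 columns: [row_index, y_1, ..., y_5]
    idx = tf.cast(y_true_with_index[:, 0], tf.int32)        # per-row index i
    y_true = y_true_with_index[:, 1:]                        # the actual 5 targets

    base = tf.reduce_sum(tf.abs(y_true - y_pred), axis=-1)   # L1 norm of the error

    k = tf.math.floormod(idx, 5)                             # i % 5
    penalty = tf.abs(
        tf.gather(y_true, k, batch_dims=1) - tf.gather(y_pred, k, batch_dims=1)
    )
    return base + penalty

# Usage sketch: prepend the row index to the targets before training.
# y_train_with_index = np.column_stack([np.arange(len(y_train)), y_train])
# model.compile(optimizer="adam", loss=indexed_l1_loss)
# model.fit(x_train, y_train_with_index, epochs=10)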

Sorry for the rambling, any help would be greatly appreciated!

submitted by /u/fr-pscz
[visit reddit] [comments]

Categories
Misc

Getting AI Applications Ready for Cloud-Native

Cloud-native is one of the most important concepts associated with deploying edge AI applications. Find out how to get AI applications cloud-native ready.

Cloud-native is one of the most important concepts associated with edge AI. That’s because cloud-native delivers massive scale for application deployments. It also delivers performance, resilience, and ease of management, all critical capabilities for edge AI. Cloud-native and edge AI are so entwined that we believe the future of edge AI is cloud-native.

This post is an overview of cloud-native components and the steps to get an application cloud-native ready. I show you how to practice using these concepts with NVIDIA Fleet Command, a cloud service for deploying and managing applications at the edge built with cloud-native principles. 

If all these steps are followed, the result is a cloud-native application that can be easily deployed on Fleet Command and other cloud-native deployment and management platforms. 

What is cloud-native? 

Cloud-native is an approach to developing and operating applications that encompasses the flexibility, scalability, and resilience of the cloud computing delivery model. The cloud-native approach lets organizations build applications that are resilient and manageable, enabling more agile application deployments.

There are key principles of cloud-native development:

  • Microservices
  • Containers
  • Helm charts
  • CI/CD
  • DevOps

What are microservices? 

Microservices are a form of software development where an application is broken down into smaller, self-contained services that communicate with each other. These self-contained services are independent, meaning each can be updated, deployed, and scaled on its own without affecting the other services in the application.

Microservices make applications faster to develop and make updates easier to deploy and scale.

What are containers?

A container is a software package that contains all of the information and dependencies necessary to run an application reliably in any computing environment. Containers can be easily deployed on different operating systems and offer portability and application isolation.  

A whole application can be containerized, but pieces of an application can be containerized as well. For instance, containers work extremely well with microservices, where applications are broken into small, self-sufficient components. Each microservice can be packaged, deployed, and managed in its own container. Additionally, multiple containers can be deployed and managed in clusters.

Containers are perfect for edge deployments because they enable you to install your application, dependencies, and environment variables one time into the container image, rather than on each system that the application runs on, making managing multiple deployments significantly easier.

That’s important for edge computing because an organization may need to install and manage hundreds or thousands of different deployments across a vast physical distance, so automating as much of the deployment process as possible is critical.

What are Helm charts?

For complex container deployments, such as deploying multiple applications across several sites with multiple systems, many organizations use Helm charts. Helm is an application package manager that runs on top of Kubernetes (discussed later). Without it, you have to manually create a separate YAML file for each workload, specifying all the details needed for a deployment, from pod configurations to load balancing.

Helm charts eliminate this tedious process by allowing organizations to define reusable templates for deployments, in addition to other benefits like versioning and the capability to customize applications mid-deployment.

What is CI/CD?

Continuous integration (CI) enables you to iterate and test new code collaboratively, usually by integrating it into a shared repository.

Continuous delivery (CD) is the automated process of taking new builds from the CI phase and loading them into a repository where they can easily be deployed into production.

A proper CI/CD process enables you to avoid disruptions in service when integrating new code into existing solutions. 

What is DevOps?

The term DevOps refers to merging development and operations teams to streamline how applications are developed and delivered to customers.

DevOps is important for cloud-native technologies, as the philosophies of both concepts are focused on delivering solutions to customers continuously and easily, and creating an end-to-end development pipeline to accelerate updates and iterations. 

What is cloud-native management?

Now that the core principles of cloud-native have been explained, it is important to discuss how to manage cloud-native applications in production. 

The leading platform for orchestrating containers is Kubernetes. Kubernetes is open source and allows organizations to deploy, manage, and scale containerized applications.

Several organizations have built enterprise-ready solutions on top of Kubernetes that offer unique benefits and functionality.

The process for getting an application ready for any Kubernetes platform, whether Kubernetes itself or a solution built on top of Kubernetes, is essentially the same. Each solution has specific configuration steps needed to ensure an organization’s cloud-native applications can run effectively without issue.

Deploying a cloud-native application with NVIDIA Fleet Command

This section walks through the configuration process by using NVIDIA Fleet Command as an example and noting the specific configurations needed. 

Step 0: Understand Fleet Command

Fleet Command is a cloud service for managing applications across disparate edge locations. It’s built on Kubernetes and deploys cloud-native applications, so the steps to get an application onto Fleet Command are the same steps to get an application onto other cloud-native management platforms.

Assuming the application is already built, there are just four steps to get that application onto Fleet Command:

  • Containerize the application
  • Determine the application requirements
  • Build a Helm chart
  • Deploy on Fleet Command

Step 1: Containerize the application

Fleet Command deploys applications as containers. By using containers, you can deploy multiple applications on the same system and also easily scale the application across multiple systems and locations. Also, all the dependencies are packed inside of the container, so you know that the application will perform the same across thousands of systems. 

Building a container for an application is easy. For more information, see the Docker guide on containers.

Here’s an example of a Dockerfile for a customized deep learning container built using an NVIDIA CUDA base image:

FROM nvcr.io/nvidia/cuda:11.3.0-base-ubuntu18.04

# Set up the environment
RUN apt-get update && apt-get install --no-install-recommends --no-install-suggests -y \
    curl unzip python3 python3-pip

# Copy the application from the local path into the container
COPY app/ /app/
WORKDIR /app

# Install the Python dependencies
RUN pip3 install -r /app/requirements.txt

# Default training configuration (can be overridden at deploy time)
ENV MODEL_TYPE='EfficientDet'
ENV DATASET_LINK='HIDDEN'
ENV TRAIN_TIME_SEC=100

CMD ["python3", "train.py"]

In this example, /app/ contains all the source code. After the Dockerfile is created, a container image can be built from it and pushed to a private registry in the cloud so that the container can be easily deployed anywhere, as sketched below.
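
For example, a minimal sketch of the build-and-push step, assuming a hypothetical image name and your own NGC org in place of <your-org>:

# Build the image from the Dockerfile in the current directory
docker build -t nvcr.io/<your-org>/my-edge-app:1.0 .

# Log in to the private registry (here, NVIDIA NGC) and push the image
docker login nvcr.io
docker push nvcr.io/<your-org>/my-edge-app:1.0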

Step 2: Determine the application requirements

When the container is complete, it is necessary to determine what the application requires to function properly. This typically involves considering security, networking, and storage requirements.

Fleet Command is a secured software stack that offers the ability to control which hardware and software the application has access to on the system where it is deployed. As a result, there are security best practices that your application should be designed around:

  • Avoiding privileged containers
  • Separating your admin and app traffic from your storage traffic
  • Minimizing system device access
  • And so on

Design your application deployment around these security requirements, keeping them in mind when configuring the networking and storage later.

The next step is to determine what networking access requirements are needed and how to expose the networking from your container.

Typically, an application requires different ports and routes to access any edge sensors and devices, admin traffic, storage traffic, and application (cloud) traffic. These ports can be exposed from Fleet Command using either NodePorts or more advanced Kubernetes networking configurations, such as an ingress controller; a minimal NodePort example is sketched below.
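
As a generic Kubernetes illustration (not a Fleet Command-specific manifest; the names and ports are placeholders), a NodePort Service exposing a container port might look like this:

apiVersion: v1
kind: Service
metadata:
  name: my-edge-app
spec:
  type: NodePort
  selector:
    app: my-edge-app        # must match the labels on the application pods
  ports:
    - port: 8554            # port exposed inside the cluster
      targetPort: 8554      # port the container listens on
      nodePort: 31113       # port exposed on the node (30000-32767 by default)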

Lastly, the application may require access to local or remote storage for saving persistent data. Fleet Command supports the capability for hostPath volume mounts. Additional Kubernetes capabilities can also be used, such as persistent volumes and persistent volume claims.

Local path or NFS provisioners can be deployed separately onto the Fleet Command system, if required, to configure local or remote storage. Applications, if they support the capability, can also be configured to connect to cloud storage.
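
For reference, a generic pod spec fragment using a hostPath volume mount could look like the following; the image, names, and paths are placeholders rather than Fleet Command defaults:

apiVersion: v1
kind: Pod
metadata:
  name: my-edge-app
spec:
  containers:
    - name: my-edge-app
      image: nvcr.io/<your-org>/my-edge-app:1.0
      volumeMounts:
        - name: local-data
          mountPath: /data          # path inside the container
  volumes:
    - name: local-data
      hostPath:
        path: /mnt/edge-data        # path on the host system
        type: DirectoryOrCreate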

For more information, see the Fleet Command Application Development Guide.

Step 3: Build a Helm chart

Now that the application requirements have been determined, it’s time to create a Helm chart.

As with containers, there are a few specific requirements for Helm charts on Fleet Command. These requirements are described in the Helm Chart Requirements section of the Application Development Guide. An NVIDIA DeepStream Helm chart is also available as a reference for building a Helm chart that can be deployed on Fleet Command.

To create your own Helm chart from scratch, first run the following command. It generates a sample Helm chart built around an NGINX Docker container, which can then be customized for any application.

$ helm create deepstream

After the Helm chart is created, this is how the chart’s directory structure appears:

deepstream
|-- Chart.yaml
|-- charts
|-- templates
|   |-- NOTES.txt
|   |-- _helpers.tpl
|   |-- deployment.yaml
|   |-- ingress.yaml
|   `-- service.yaml
`-- values.yaml

Next, modify the values.yaml file with the following values to configure the sample Helm chart for the DeepStream container and networking; the image and service settings are the fields changed from the generated defaults.

image:
  repository: nvcr.io/nvidia/deepstream
  pullPolicy: IfNotPresent
  # Overrides the image tag whose default is the chart appVersion.
  tag: 5.1-21.02-samples

imagePullSecrets: []
nameOverride: ""
fullnameOverride: ""

serviceAccount:
  # Specifies whether a service account should be created
  create: false
  # Annotations to add to the service account
  annotations: {}
  # The name of the service account to use.
  # If not set and create is true, a name is generated using the fullname template
  name: ""

podAnnotations: {}

podSecurityContext: {}
  # fsGroup: 2000

securityContext: {}
  # capabilities:
  #   drop:
  #   - ALL
  # readOnlyRootFilesystem: true
  # runAsNonRoot: true
  # runAsUser: 1000

service:
  type: NodePort
  port: 8554
  nodeport: 31113

After you create a customized Helm chart, it can then be uploaded to a private registry alongside the container.
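
As a rough sketch, the chart can be packaged into a versioned archive with Helm before upload; the exact upload mechanism depends on your registry (for Fleet Command, see the NGC documentation):

$ helm package deepstream

With the chart version generated by helm create (0.1.0), this produces an archive such as deepstream-0.1.0.tgz, which is what gets uploaded to the registry.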

Step 4: Deploy on Fleet Command

With the application containerized and a Helm chart built, load the application onto Fleet Command. Applications are loaded on NGC, a hub for GPU-accelerated applications, models, and containers, and then made available to deploy on Fleet Command. The application can be public, but can also be hosted in a private registry where access is restricted to the organization.

The entire process is covered step by step in the Fleet Command User Guide and is also showcased in the Fleet Command demo video.

Bonus step: Join our ecosystem of partners

Recently, NVIDIA announced an expansion of the NVIDIA Metropolis Partner Program that now includes Fleet Command. Metropolis partners that configure their applications to be deployed on Fleet Command get free access to the solution to run proofs of concept (POCs) for customers. By using Fleet Command, partners don’t need to build a bespoke solution in a customer environment for an evaluation. They can use Fleet Command and deploy their application at customer sites in minutes.

Get started on cloud-native

This post covered the core principles of cloud-native technology and how to get applications cloud-native ready with Fleet Command.

Your next step is to get hands-on experience deploying and managing applications in a cloud-native environment. NVIDIA LaunchPad can help.

LaunchPad provides immediate, short-term access to a Fleet Command instance to easily deploy and monitor real applications on real servers. Hands-on labs walk you through the entire process, from infrastructure provisioning and optimization to application deployment in the context of applicable use cases, like deploying a vision AI application at the edge of a network. 

Get started on LaunchPad today for free.

Categories
Misc

Inference Speed on TF2 object detection API

Is there a way to calculate the inference speed using the TF2 object detection API?

submitted by /u/giakou4
[visit reddit] [comments]