DataBloom - Part 349

Misc

Do you use GPU cloud platform?

Post date March 9, 2022

Hi guys!

We are a blockchain-based GPU cloud platform, and we plan to offer GPU support for developers. If you use a GPU cloud platform, what is the most important reason you chose it? Compared with other GPU cloud platforms, does yours have any special features?

It would be great to get some insight from people who know this space and are willing to share their comments! We will invite 3 of you to get a free GPU for 72 hours. Thanks!

submitted by /u/May-Feng

Misc

Object Detection API performance on AMD G Embedded CPUs

Post date March 8, 2022

This is my first post here, so hello everyone!
I am currently using SSD MobileNet V2 FPNLite 320×320 as my model, but I am just prototyping, so that could change. Basically, I need a relatively low-power device that will run my model in real time. Raspberry Pi is out of stock everywhere, and I found a possible alternative: second-hand thin clients. Most of the cheap ones have AMD G Embedded CPUs (G-T56N or G-T48E), and I couldn’t find anything relating them to machine learning. Will they have enough power to run object detection in real time? How do they compare to RPi 4 or 3 performance? I am obviously fine with a bigger form factor and higher power consumption.

Any help will be appreciated!

submitted by /u/Own-Combination-4238

Misc

NVIDIA Announces Investor Day for Financial Community

Post date March 8, 2022

NVIDIA will present the following virtual event for the financial community: NVIDIA Investor Day Tuesday, March 22, 2022, at 10 a.m. …

Offsites

Robust Graph Neural Networks

Post date March 8, 2022

Posted by Bryan Perozzi, Research Scientist and Qi Zhu, Research Intern, Google Research

Graph Neural Networks (GNNs) are powerful tools for leveraging graph-structured data in machine learning. Graphs are flexible data structures that can model many different kinds of relationships and have been used in diverse applications like traffic prediction, rumor and fake news detection, modeling disease spread, and understanding why molecules smell.

Graphs can model the relationships between many different types of data, including web pages (left), social connections (center), or molecules (right).

As is standard in machine learning (ML), GNNs assume that training samples are selected uniformly at random (i.e., are an independent and identically distributed or “IID” sample). This is easy to do with standard academic datasets, which are specifically created for research analysis and therefore have every node already labeled. However, in many real world scenarios, data comes without labels, and labeling data can be an onerous process involving skilled human raters, which makes it difficult to label all nodes. In addition, biased training data is a common issue because the act of selecting nodes for labeling is usually not IID. For example, sometimes fixed heuristics are used to select a subset of data (which shares some characteristics) for labeling, and other times, human analysts individually choose data items for labeling using complex domain knowledge.

Localized training data is a typical non-IID bias exhibited in graph-structured data. This is shown on the left figure by taking an orange node and expanding to those around it. Instead, an IID training sample of nodes for labeling would be uniformly distributed, as illustrated by the sampling process on the right.

To quantify the amount of bias present in a training set, one can use methods that measure how large the shift is between two different probability distributions, where the size of the shift can be thought of as the amount of bias. As the shift grows in size, machine learning models have more difficulty generalizing from the biased training set. This situation can meaningfully hurt generalizability — on academic datasets, we’ve observed domain shifts causing a performance drop of 15-20% (as measured by the F1 score).

In “Shift-Robust GNNs: Overcoming the Limitations of Localized Graph Training Data”, presented at NeurIPS 2021, we introduce a solution for using GNNs on biased data. Called Shift-Robust GNN (SR-GNN), this approach is designed to account for distributional differences between biased training data and a graph’s true inference distribution. SR-GNN adapts GNN models to the presence of distributional shift between the nodes labeled for training and the rest of the dataset. We illustrate the effectiveness of SR-GNN in a variety of experiments with biased training datasets on common GNN benchmark datasets for semi-supervised learning and show that SR-GNN outperforms other GNN baselines in accuracy, reducing the negative effects of biased training data by 30–40%.

The Impact of Distribution Shifts on Performance
To demonstrate how distribution shift affects GNN performance, we first generate a number of biased training sets for known academic datasets. Then in order to understand the effect, we plot the generalization (test accuracy) versus a measure of distribution shift (the Central Moment Discrepancy¹, CMD). For example, consider the well known PubMed citation dataset, which can be thought of as a graph where the nodes are medical research papers and the edges represent citations between them. When we generate biased training data for PubMed, the plot looks like this:

The effect of distribution shift on the PubMed dataset. Performance (F1) is shown on the y-axis vs. the distribution shift, Central Moment Discrepancy (CMD), on the x-axis, for 100 biased training set samples. As the distribution shift increases, the model’s accuracy falls.

Here one can observe a strong negative correlation between the distribution shift in the dataset and the classification accuracy: as CMD increases, the performance (F1) decreases. That is, GNNs can have difficulty generalizing as their training data looks less like the test dataset.
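As a rough illustration of the kind of measurement described above, the sketch below computes a Central Moment Discrepancy between two sets of feature vectors in plain Python. It follows the usual definition of CMD (the distance between the means plus the distances between higher-order central moments, each scaled by the feature range). The function name, the `max_order=5` default, and the assumption that features lie in `[lower, upper]` are illustrative choices, not details taken from the post.

```python
import math


def central_moment_discrepancy(xs, ys, max_order=5, lower=0.0, upper=1.0):
    """CMD between two samples, each a list of equal-length feature vectors.

    Assumes every feature lies in [lower, upper]. max_order is an
    illustrative default, not a value from the post.
    """
    span = abs(upper - lower)
    dim = len(xs[0])

    def mean(rows):
        # Per-feature mean of a sample.
        return [sum(r[j] for r in rows) / len(rows) for j in range(dim)]

    def central_moment(rows, mu, k):
        # Per-feature k-th central moment around the given mean.
        return [sum((r[j] - mu[j]) ** k for r in rows) / len(rows)
                for j in range(dim)]

    def l2(a, b):
        # Euclidean distance between two vectors.
        return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

    mu_x, mu_y = mean(xs), mean(ys)
    total = l2(mu_x, mu_y) / span
    for k in range(2, max_order + 1):
        total += l2(central_moment(xs, mu_x, k),
                    central_moment(ys, mu_y, k)) / span ** k
    return total
```

Calling it with the features of a biased training set as `xs` and a uniformly drawn sample as `ys` gives a single non-negative number; identical samples give zero, and the value grows as the two samples drift apart.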

To address this, we propose a shift-robust regularizer (similar in idea to domain-invariant learning) to minimize the distribution shift between training data and an IID sample from unlabeled data. To do this, we measure the domain shift (e.g., via CMD) in real time as the model is training and apply a direct penalty based on this that forces the model to ignore as much of the training bias as possible. This forces the feature encoders that the model learns for the training data to also work effectively for any unlabeled data, which might come from a different distribution.

The figure below shows what this looks like when compared to a traditional GNN model. We still have the same inputs (the node features X and the adjacency matrix A) and the same number of layers. However, the final embedding Z_k from layer k of the GNN is compared against embeddings from unlabeled data points to verify that the model is correctly encoding them.

SR-GNN adds two kinds of regularizations to deep GNN models. First, a domain shift regularization (λ term) minimizes the distance between hidden representations of the labeled (Z_k) and unlabeled (Z_IID) data. Second, the instance weight (β) of the examples can be changed to further approximate the true distribution.

We write this regularization as an additional term in the formula for the model’s loss based on the distance between the training data’s representations and the true data’s distribution (full formulas available in the paper).

In our experiments, we compare our method with a number of standard graph neural network models to measure their performance on node classification tasks. We demonstrate that adding the SR-GNN regularization gives a 30–40% improvement on classification tasks with biased training data labels.

A comparison of SR-GNN using node classification with biased training data on the PubMed dataset. SR-GNN outperforms seven baselines, including DGI, GCN, GAT, SGC and APPNP.

Shift-Robust Regularization for Linear GNNs via Instance Re-weighting
Moreover, it’s worth noting that there’s another class of GNN models (e.g., APPNP, SimpleGCN, etc.) that are based on linear operations to speed up their graph convolutions. We also examined how to make these models more reliable in the presence of biased training data. While the same regularization mechanism cannot be directly applied due to their different architecture, we can “correct” the training bias by re-weighting the training instances according to their distance from an approximated true distribution. This allows correcting the distribution of the biased training data without passing gradients through the model.

Finally, the two regularizations — for both deep and linear GNNs — can be combined into a generalized regularization for the loss, which combines both domain regularization and instance reweighting (details, including the loss formulas, available in the paper).
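The exact loss is given in the paper. The sketch below only illustrates the general shape that the text describes: a classification loss in which each example carries an instance weight (β), plus a penalty (scaled by λ) on the distance between labeled and unlabeled embeddings. The function names are hypothetical, and `mean_shift` is a deliberately simple stand-in for CMD or any other shift measure.

```python
import math


def mean_shift(z_train, z_iid):
    """Placeholder shift measure: L2 distance between mean embeddings.

    A stand-in for CMD or another distribution distance; not the
    measure used in the paper.
    """
    dim = len(z_train[0])
    mu_t = [sum(z[j] for z in z_train) / len(z_train) for j in range(dim)]
    mu_i = [sum(z[j] for z in z_iid) / len(z_iid) for j in range(dim)]
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(mu_t, mu_i)))


def shift_robust_loss(per_example_losses, betas, z_train, z_iid,
                      lam=1.0, shift_fn=mean_shift):
    """Instance-weighted classification loss plus lam times a shift penalty.

    per_example_losses: classification loss of each labeled node
    betas:              instance weight of each labeled node
    z_train, z_iid:     embeddings of labeled nodes and of an IID sample
    """
    weighted = sum(b * l for b, l in zip(betas, per_example_losses))
    weighted /= len(per_example_losses)
    return weighted + lam * shift_fn(z_train, z_iid)
```

Setting every β to 1 leaves only the domain-shift penalty, and setting `lam=0` leaves only the instance re-weighting. In a real model both terms would be computed inside an autodiff framework so that the penalty can shape the learned embeddings.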

Conclusion
Biased training data is common in real world scenarios and can arise due to a variety of reasons, including difficulties of labeling a large amount of data, the various heuristics or inconsistent techniques that are used to choose nodes for labeling, delayed label assignment, and others. We presented a general framework (SR-GNN) that can reduce the influence of biased training data and can be applied to various types of GNNs, including both deeper GNNs and more recent linearized (shallow) versions of these models.

Acknowledgements
Qi Zhu is a PhD Student at UIUC. Thanks to our collaborators Natalia Ponomareva (Google Research) and Jiawei Han (UIUC). Thanks to Tom Small and Anton Tsitsulin for visualizations.

¹We note that many measures of distribution shift have been proposed in the literature. Here we use CMD (as it is quick to calculate and generally shows good performance in the domain adaptation literature), but the concept generalizes to any measure of distribution distances/domain shift.

Misc

Can’t use GPU after upgrading TensorFlow from v2.3.2 to v2.8

Post date March 8, 2022

Hi,

I recently upgraded TensorFlow on my local machine from v2.3.2 to v2.8 to use new features, and now TensorFlow is unable to access the GPU. Below is a screenshot of the command prompt when the new TensorFlow version executes a GPU command. What should I do to fix this?

https://preview.redd.it/hgp8dfbm55m81.png?width=1920&format=png&auto=webp&s=0d4d14f7061597d2770a4ae52fe6cb20d409aa13

submitted by /u/Better-Ad8608

Misc

Deploy AI Workloads at Scale with Bottlerocket and NVIDIA-Powered Amazon EC2 Instances

Post date March 7, 2022

AWS and NVIDIA collaborated on Bottlerocket, a container-optimized OS, to support all NVIDIA powered Amazon EC2 instances like the P4d, P3, G4dn, and G5 instances.

Deploying AI-powered services like voice-based assistants, e-commerce product recommendations, and contact-center automation into production at scale is challenging. Delivering the best end-user experience while reducing operational costs requires accounting for multiple factors. These include composition and performance of underlying infrastructure, flexibility to scale resources based on user-demand, cluster management overhead, and security.

To address the challenges of deploying AI at scale, Enterprise IT teams have adopted Kubernetes (K8s) for container orchestration and NVIDIA accelerated computing to meet the performance needs of production AI deployments. In addition, there’s a growing focus on the role of the operating system (OS) for production infrastructure. The host OS of the production environment has a direct impact on the security, resource utilization, and time it takes to provision and scale additional resources. This influences the user experience, security, and cost of deployments as user demand increases.

Bottlerocket: a Linux-based container-optimized OS

Bottlerocket is a minimal, Linux-based, open-source OS developed by AWS that is purpose-built for running containers. With a strong emphasis on security, it includes only the software essential for running containers.

This reduces the attack surface and impact of vulnerabilities, requiring less effort to meet node compliance requirements. In addition, the minimal host footprint of Bottlerocket helps improve node resource usage and boot times.

Updates to Bottlerocket are applied in a single step and can be rolled back if necessary. This results in lower error rates and improved uptime for container applications. Updates can also be automated using container orchestration services such as Amazon Elastic Kubernetes Service (EKS) and Amazon Elastic Container Service (ECS).

Use Bottlerocket with Amazon EC2 instances powered by NVIDIA GPUs

AWS and NVIDIA have collaborated to enable Bottlerocket to support all NVIDIA-powered Amazon EC2 instances including P4d, P3, G4dn, and G5. This support combines the computational power of NVIDIA-powered GPU instances with the benefits of a container-optimized OS for deploying AI models on K8s clusters at scale.

The result is enhanced security and faster boot times, especially when AI workloads scale out to additional GPU-based instances in real time.

Figure 1: Containerized GPU-optimized applications can be deployed on K8s clusters using Bottlerocket support for NVIDIA-powered Amazon EC2 instances.

Support for NVIDIA GPUs is delivered in the form of the Bottlerocket GPU-optimized AMI. This includes NVIDIA drivers, a K8s GPU device-plugin, and containerd runtime built into the base image.

The AMI provides everything needed to provision self-managed nodes running Bottlerocket OS on NVIDIA-powered GPU instances and to register them with an Amazon EKS cluster.

In addition, you can also leverage NVIDIA optimized software from the NVIDIA NGC Catalog on AWS Marketplace—a hub for pretrained models, scripts, Helm charts, and a wide array of AI and HPC software.

For AI inference deployments on AWS, you can leverage the NVIDIA Triton Inference Server. Use the open-source inference serving software to deploy trained AI models from many frameworks including TensorFlow, TensorRT, PyTorch, ONNX, XGBoost, and Python on any GPU or CPU infrastructure.

Learn more about the Bottlerocket support for NVIDIA GPUs from AWS.

Offsites

Learning from Weakly-Labeled Videos via Sub-Concepts

Post date March 7, 2022

Posted by Zizhao Zhang and Guanhang Wu, Software Engineers, Google Research, Cloud AI Team

Video recognition is a core task in computer vision with applications from video content analysis to action recognition. However, training models for video recognition often requires untrimmed videos to be manually annotated, which can be prohibitively time consuming. In order to reduce the effort of collecting videos with annotations, learning visual knowledge from videos with weak labels, i.e., where the annotation is auto-generated without manual intervention, has attracted growing research interest, thanks to the large volume of easily accessible video data. Untrimmed videos, for example, are often acquired by querying with keywords for classes that the video recognition model aims to classify. A keyword, which we refer to as a weak label, is then assigned to each untrimmed video obtained.

Although large-scale videos with weak labels are easier to collect, training with unverified weak labels poses another challenge in developing robust models. Recent studies have demonstrated that, in addition to the label noise (e.g., incorrect action labels on untrimmed videos), there is temporal noise due to the lack of accurate temporal action localization — i.e., an untrimmed video may include other non-targeted content or may only show the target action in a small proportion of the video.

Reducing noise effects for large-scale weakly-supervised pre-training is critical but particularly challenging in practice. Recent work indicates that querying short videos (e.g., ~1 minute in length) to obtain more accurate temporal localization of target actions or applying a teacher model to do filtering can yield improved results. However, such data pre-processing methods prevent models from fully utilizing available video data, especially longer videos with richer content.

In “Learning from Weakly-Labeled Web Videos via Exploring Sub-Concepts”, we propose a solution to these issues that uses a simple learning framework to conduct effective pre-training on untrimmed videos. Instead of simply filtering the potential temporal noise, this approach converts such “noisy” data to useful supervision by creating a new set of meaningful “middle ground” pseudo-labels that expand the original weak label space, a novel concept we call Sub-Pseudo Label (SPL). The model is pre-trained on this more “fine-grained” space and then fine-tuned on a target dataset. Our experiments demonstrate that the learned representations are much better than previous approaches. Moreover, SPL has been shown to be effective in improving the action recognition model quality for Google Cloud Video AI, which enables content producers to easily search through massive libraries of their video assets to quickly source content of interest.

Sampled training clips may represent a different visual action (whisking eggs) from the query label of the whole untrimmed video (baking cookies). SPL converts the potential label noise to useful supervision signals by creating a new set of “middle ground” pseudo-classes (i.e., sub-concepts) via extrapolating two related action classes. Enriched supervision is provided for effective model pre-training.

Sub-Pseudo Label (SPL)
SPL is a simple technique that advances the teacher-student training framework, which is known to be effective for self-training and to improve semi-supervised learning. In the teacher-student framework, a teacher model is trained on high-quality labeled data and then assigns pseudo-labels to unlabeled data. The student model trains on both high-quality labeled data and the unlabeled data that has the teacher-predicted labels. While previous methods have proposed a number of ways to improve the pseudo-label quality, SPL takes a novel approach that combines knowledge from both weak labels (i.e., query text used to acquire data) and teacher-predicted labels, which results in better pseudo-labels overall. This method focuses on video recognition where temporal noise is challenging, but it can be extended easily to other domains, like image classification.

The overall pre-training framework for learning from weakly labeled videos via SPLs. Each trimmed video clip is re-labeled using SPL given the teacher-predicted labels and the weak labels used to query the corresponding untrimmed video.

The SPL method is motivated by the observation that within an untrimmed video “noisy” video clips have semantic relations with the target action (i.e., the weak label class), but may also include essential visual components of other actions, such as the teacher model–predicted class. Our approach uses the extrapolated SPLs from weak labels together with the distilled labels to capture the enriched supervision signals, encouraging learning better representations during pre-training that can be used for downstream fine-tuning tasks.

It is straightforward to determine the SPL class for each video clip. We first perform inference on each video clip using the teacher model trained from a target dataset to get a teacher prediction class. Each clip is also labeled by the class (i.e., query text) of the untrimmed source video. A 2-dimensional confusion matrix is used to summarize the alignments between the teacher model inferences and the original weak annotations. Based on this confusion matrix, we conduct label extrapolation between teacher model predictions and weak labels to obtain the raw SPL label space.

Left: The confusion matrix, which is the basis of the raw SPL label space. Middle: The resulting SPL label spaces (16 classes in this example). Right: SPL-B, another SPL version, that reduces the label space by collating agreed and disagreed entries of each row as independent SPL classes, which in this example results in only 8 classes.
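The label construction described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, classes are assumed to be integer indices, and the rows of the confusion matrix are assumed to correspond to weak labels. With 4 classes it yields the 16 raw SPL classes and the 8 SPL-B classes mentioned in the caption.

```python
def confusion_matrix(weak_labels, teacher_preds, num_classes):
    """Count clips for each (weak label, teacher prediction) pair."""
    counts = [[0] * num_classes for _ in range(num_classes)]
    for weak, teacher in zip(weak_labels, teacher_preds):
        counts[weak][teacher] += 1
    return counts


def spl_label(weak, teacher, num_classes):
    """Raw SPL: one class per cell of the confusion matrix."""
    return weak * num_classes + teacher


def spl_b_label(weak, teacher):
    """SPL-B: per weak-label row, one 'agreed' and one 'disagreed' class."""
    return 2 * weak + (0 if teacher == weak else 1)
```

A clip whose weak label and teacher prediction agree lands on the diagonal of the matrix. Every off-diagonal cell becomes its own "middle ground" class under raw SPL, while SPL-B merges all off-diagonal cells of a row into one class.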

Effectiveness of SPL
We evaluate the effectiveness of SPL in comparison to different pre-training methods applied to a 3D ResNet50 model that is fine-tuned on Kinetics-200 (K200). One pre-training approach simply initializes the model using ImageNet. The other pre-training methods use 670k video clips sampled from an internal dataset of 147k videos, collected following standard processes similar to those described for Kinetics-200, that cover a broad range of actions. Weak label training and teacher prediction training use either the weak labels or teacher-predicted labels on the videos, respectively. Agreement filtering uses only the training data for which the weak labels and teacher-predicted labels match. We find that SPL outperforms each of these methods. Though the dataset used to illustrate the SPL approach was constructed for this work, in principle the method we describe applies to any dataset that has weak labels.

Pre-training Method	Top-1	Top-5
ImageNet Initialized	80.6	94.7
Weak Label Train	82.8	95.6
Teacher Prediction Train	81.9	95.0
Agreement Filtering Train	82.9	95.4
*SPL*	84.3	95.7

We also demonstrate that sampling more video clips from a given number of untrimmed videos can help improve the model performance. With a sufficient number of video clips available, SPL methods consistently outperform weak label pre-training by providing enriched supervision.

As more clips are sampled from 147K videos, the label noise is increased gradually. SPL becomes more and more effective at utilizing the weakly-labeled clips to achieve better pre-training.

We visualize the visual concepts learned from SPL with attention visualization by applying Grad-CAM on the trained model. It is interesting to observe some meaningful “middle ground” concepts that can be learned by SPL.

Examples of attention visualization for SPL classes. Some meaningful “middle ground” concepts can be learned by SPL, such as mixing up the eggs and flour (left) and using the abseiling equipment (right).

Conclusion
We demonstrate that SPLs can provide enriched supervision for pre-training. SPL does not increase training complexity and can be treated as an off-the-shelf technique to integrate with teacher-student–based training frameworks. We believe this is a promising direction for discovering meaningful visual concepts by bridging weak labels and the knowledge distilled from teacher models. SPL has also demonstrated promising generalization to the image recognition domain and we expect future extensions that apply to tasks that have noise in labels. We have successfully applied SPL for Google Cloud Video AI where it has improved the accuracy of the action recognition models, helping users to better understand, search, and monetize their video content library.

Acknowledgements
We gratefully acknowledge the contributions of other co-authors, including Kunpeng Li, Xuehan Xiong, Chen-Yu Lee, Zhichao Lu, Yun Fu, Tomas Pfister. We also thank Debidatta Dwibedi, David A Ross, Chen Sun, Jonathan C. Stroud, and Wei Hua for their valuable comments and help on this work, and Tom Small for figure creation.

Misc

Storage Specialist Excelero Joins NVIDIA

Post date March 7, 2022

Excelero, a Tel Aviv-based provider of high-performance software-defined storage, is now a part of NVIDIA. The company’s team of engineers, including its seasoned co-founders with decades of experience in HPC, storage and networking, brings deep expertise in the block storage that large businesses use in storage-area networks. Now their mission is to help …

The post Storage Specialist Excelero Joins NVIDIA appeared first on NVIDIA Blog.

Misc

Unable to save and load TensorFlow Sequential model

Post date March 6, 2022

I’m trying to train a TensorFlow Keras Sequential model with multiple layers, save it, and reload it for testing, but the accuracy is never preserved across saving and loading. I saw that this was a bug that was supposedly fixed in recent TensorFlow versions, but I have the most recent TensorFlow update and still no luck. Does anyone else have this problem, or any idea how to fix it? Even my CS professor told me to give up and use PyTorch instead 😭

submitted by /u/zhen9hui9x

Misc

Do I need to run an object recognition prior to classification to isolate individuals?

Post date March 6, 2022

I’m helping my daughter set up a camera that will identify birds that show up at her bird feeders using this model (https://tfhub.dev/google/aiy/vision/classifier/birds_V1/1). I’ve got it working (I haven’t started running it with the camera yet, but I have already figured out how to get the frames off of it), BUT it seems to have a LOT of trouble making an identification when there are multiple birds in the shot. For example, a picture of an empty bird feeder came back as chickadee at ~20%, but a picture with 8 sparrows came back as sparrow at only ~15%. Similarly, a very clear shot of a cardinal with 5 other birds came back as cardinal at only 21%. I do understand that the model isn’t going to be perfectly accurate and that 20% means the model isn’t very confident (I’m not concerned about that), but I need to set some bottom threshold, and I’m concerned this means that any time there are multiple birds at the feeders, which would end up being most of the time, the system will basically stop working.

So do I need to run an object detection model on the images, clip out individual images of “birds”, and then run this identification model on each one? If so, would anyone have a suggestion for an easyish way to do this? I’m far from a competent coder, so advice and suggestions are welcome!
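For what it's worth, the detect-then-classify idea described above could be wired together roughly as follows. This is only a sketch of the control flow: `detect` and `classify` are hypothetical callables standing in for whatever object detector and bird classifier end up being used, the image is assumed to be a plain list of pixel rows, and the 0.5 threshold is an arbitrary placeholder.

```python
def classify_each_bird(image, detect, classify, min_confidence=0.5):
    """Crop every detected box and classify each crop separately.

    image:    list of pixel rows (any per-pixel value type)
    detect:   hypothetical callable, image -> list of
              (top, left, bottom, right) boxes in pixel coordinates
    classify: hypothetical callable, crop -> (label, confidence)
    """
    results = []
    for top, left, bottom, right in detect(image):
        # Slice out the rows and columns covered by this box.
        crop = [row[left:right] for row in image[top:bottom]]
        label, confidence = classify(crop)
        if confidence >= min_confidence:
            results.append((label, confidence))
    return results
```

Because each crop contains a single bird, the threshold applies per bird rather than to one score for the whole crowded frame.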

submitted by /u/StrongAbbreviations5