DataBloom - Part 486

Misc

Marbles RTX Playable Sample Now Available in NVIDIA Omniverse

Post date May 27, 2021

Here’s a chance to become a marvel at marbles: the Marbles RTX playable sample is now available from the NVIDIA Omniverse launcher. Marbles RTX is a physics-based mini-game level where a player controls a marble around a scene full of obstacles. The sample, which already has over 8,000 downloads, displays real-time physics with dynamic lighting…

The post Marbles RTX Playable Sample Now Available in NVIDIA Omniverse appeared first on The Official NVIDIA Blog.

Misc

“ValueError: cannot reshape array of size 278540 into shape (256,128,3,3)” Conversion YOLOv3 .weights to .pb

Post date May 27, 2021

I have trained a YOLO v3 object detection model. To incorporate it into my Flutter application, I am trying to convert it to .tflite, with .pb needed as an intermediate format. I get this error with every GitHub repo I have tried (a few are linked below).

Error: ValueError: cannot reshape array of size 278540 into shape (256,128,3,3)

Following is what my classes.names file looks like:

upstairs

downstairs

I have just 2 classes. I am unable to convert. Can someone please help?

Link to my weights and config file:

A few repos that I have tried:

submitted by /u/mishaalnaeem
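
A note on what the error is saying: the converter is trying to fill a 3x3 convolution with 256 filters over 128 input channels, which needs 256 × 128 × 3 × 3 values, but only 278540 values are left in the .weights file at that point. A shortfall like this often suggests that the .cfg being parsed does not describe the same network that produced the weights (for example, a stock 80-class config used with 2-class weights). That diagnosis is an assumption, since the linked files are not available here. The sketch below only reproduces the arithmetic; `conv_weight_count` and `yolo_head_filters` are hypothetical helpers, not part of any converter.

```python
from math import prod

def conv_weight_count(shape):
    """Number of float values needed to fill a conv kernel of the given shape."""
    return prod(shape)

def yolo_head_filters(num_classes, anchors_per_scale=3):
    """Filters in the conv layer before each [yolo] layer: (classes + 5) * anchors."""
    return (num_classes + 5) * anchors_per_scale

expected = conv_weight_count((256, 128, 3, 3))  # shape from the ValueError
remaining = 278540                              # size reported in the ValueError
print("values needed:", expected)
print("values short:", expected - remaining)

# The head layers are where a 2-class and an 80-class config differ.
print("head filters for 2 classes:", yolo_head_filters(2))
print("head filters for 80 classes:", yolo_head_filters(80))
```

If the head filter counts in the .cfg passed to the converter do not match the class count used in training, every layer read after the first head is offset, and the file runs out of values early.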

Misc

Adding value rules to a tf model(noob)

Post date May 27, 2021

Hello all

TLDR: 4 columns in df, sequential model, LSTM, how to add rule column a > column b for all a,b, and get a return for all 4 columns

Thanks for your help in advance.

I’m working on a sequential model in Python. I’m pretty much teaching myself as I go, so I apologize for any incorrect jargon or naïveté.

Assume we have a list of family histories, with each family member’s weight in its own column, ordered heaviest to lightest.

I.e., one row per generation:

Heaviest  Med heavy  Med light  Lightest
275       225        180        145
300       250        225        165

I have tried two approaches to guess the weights of the next generation. I have 100 generations to iterate over.

The first approach is to just feed the whole df into a tf sequential model with an LSTM. Maybe I don’t understand exactly what’s happening when I do that (I don’t), but it returns a single value, not 4. (And I’m not sure it knows that column ‘heaviest’ > ‘lightest’ for all generations.) So as a workaround I thought, oh, just split it up, pass each column through its own model, and then look at the values. I’m obviously losing way too many connections, because I’m only using 25% of the data at a time, and the results are, well, not really ordered.

So, long story short, my question is: if I pass the entire 4-column df and I want tf to guess each value of the next generation, what do I need to add to force it to guess all 4? And is there a way to simply pass in a rule I already know about the data?

submitted by /u/obibongcannobi
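
On the two questions above: getting four values back generally means giving the model four output units (for instance a final `Dense(4)` layer in Keras) and training against a full row of four weights instead of a single column. For the ordering rule, one common trick is to have the model predict the lightest weight plus three positive gaps, so that the order holds by construction. The sketch below shows only that reparameterization in plain Python; `ordered_weights` is a hypothetical helper, and softplus is just one possible choice of positive function.

```python
import math

def softplus(x):
    """Smooth positive function log(1 + e^x), written in a numerically stable form."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

def ordered_weights(raw):
    """Map 4 unconstrained numbers to weights ordered heaviest -> lightest.

    raw[0] is read as the lightest weight; raw[1:] become positive gaps
    stacked on top of it, so the ordering never has to be learned.
    """
    ascending = [raw[0]]
    for gap in (softplus(g) for g in raw[1:]):
        ascending.append(ascending[-1] + gap)
    return ascending[::-1]  # heaviest first, matching the column order

# Raw values as a network's last layer might emit them (made-up numbers).
print(ordered_weights([145.0, -3.0, 0.0, 2.5]))
```

In a real model the same mapping would sit after the last layer, so the loss is computed on weights that are already ordered.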

Misc

NVIDIA Announces Financial Results for First Quarter Fiscal 2022

Post date May 27, 2021

NVIDIA today reported record revenue for the first quarter ended May 2, 2021, of $5.66 billion, up 84 percent from a year earlier and up 13 percent from the previous quarter, with record revenue from the company’s Gaming, Data Center and Professional Visualization platforms.

Misc

The Roaring 20+: GFN Thursday Game Releases Include Biomutant, Maneater, Warhammer Age of Sigmar: Storm Ground and More

Post date May 27, 2021

GFN Thursday comes roaring in with 22 games and support for three DLCs joining the GeForce NOW library this week. Among the 22 new releases are five day-and-date game launches: Biomutant, Maneater, King of Seas, Imagine Earth and Warhammer Age of Sigmar: Storm Ground. DLC, Without the Download: GeForce NOW ensures your favorite games are…

The post The Roaring 20+: GFN Thursday Game Releases Include Biomutant, Maneater, Warhammer Age of Sigmar: Storm Ground and More appeared first on The Official NVIDIA Blog.

Misc

Error Converting Image to Luma channel

Post date May 26, 2021

I’m trying to convert an RGB image into the luma channel, similar to how it is done in PIL, but I cannot find a good way to do this.

I have tried with tensorflow_io and the values are incorrect.

With tensorflow_io:

    import tensorflow as tf
    import tensorflow_io as tfio

    img_file = tf.io.read_file("./img/img.jpg")
    img = tf.image.decode_jpeg(img_file, channels=3)
    # Channel 0 of the YCbCr conversion is the Y (luma) channel
    luma = tfio.experimental.color.rgb_to_ycbcr(img)[:, :, 0]
    luma.numpy()
    """
    Value:
    array([[ 22,  22,  22, ...,  21,  21,  21],
           [ 22,  22,  22, ...,  21,  21,  21],
           [ 22,  22,  22, ...,  21,  21,  21],
           ...,
           [159, 159, 156, ...,  51,  48,  48],
           [158, 158, 158, ...,  50,  46,  46],
           [226, 226, 227, ..., 230, 231, 231]], dtype=uint8)
    """

With PIL:

    import numpy as np
    from PIL import Image

    im = Image.open("./img/img.jpg")
    im = im.convert("L")  # mode "L" is 8-bit grayscale
    np.asarray(im)
    """
    Value:
    array([[  8,   8,   8, ...,   6,   6,   6],
           [  8,   8,   8, ...,   6,   6,   6],
           [  8,   8,   8, ...,   6,   6,   6],
           ...,
           [168, 167, 165, ...,  41,  38,  38],
           [167, 167, 167, ...,  42,  37,  37],
           [246, 246, 247, ..., 251, 253, 253]], dtype=uint8)
    """

Am I doing something wrong here?

submitted by /u/potato-sword
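
One possible explanation for the gap: Pillow documents mode "L" as the ITU-R 601-2 luma transform, L = R * 299/1000 + G * 587/1000 + B * 114/1000, on a full 0..255 scale, while digital YCbCr conversions often place Y in the 16..235 "studio" range. That `rgb_to_ycbcr` uses the studio range is an assumption inferred from the numbers in the post, not something confirmed here. The sketch below shows the two scales side by side in plain Python; the function names are hypothetical.

```python
def luma_full_range(r, g, b):
    """ITU-R 601-2 luma on a 0..255 scale (the formula Pillow documents for mode "L")."""
    return (r * 299 + g * 587 + b * 114) / 1000

def full_to_studio(luma):
    """Squeeze a 0..255 luma value into the 16..235 studio range."""
    return 16 + 219 * luma / 255

def studio_to_full(y):
    """Undo the studio-range scaling so values can be compared with Pillow's."""
    return (y - 16) * 255 / 219

# Luma values taken from the PIL output in the post.
for pil_value in (8, 41, 168, 246):
    print(pil_value, "->", round(full_to_studio(pil_value), 1))
```

The mapped values land within a couple of counts of the tensorflow_io output shown in the post. If the assumption holds, applying `studio_to_full` to the Y channel (in floating point, then rounding and clipping) should bring the two results close together, with small differences left over from rounding and JPEG decoding.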

Misc

Most computationally efficient way to get a list of random numbers in TensorFlow given a list of maximum values, like in `np.random.randint`

For np.random.randint, you can input a list of maximum values, and get a list of random ints from 0 to those maximum values.

    np.random.randint([1, 10, 100, 1000])
    > array([  0,   7,  31, 348])

TensorFlow’s tf.random.uniform doesn’t allow lists for maxval, so you need to either write a statement for each value or run a loop. I was wondering if there is a more elegant way to get these random numbers.

submitted by /u/PrudentAlternative10
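
One vectorized workaround is to draw uniform floats in [0, 1), scale each by its own bound, and floor the result. The sketch below assumes TensorFlow 2.x in eager mode; `randint_per_element` is a hypothetical helper name, and the float64 draw plus the final clamp are precautions against rounding, not requirements of the API.

```python
import tensorflow as tf

def randint_per_element(maxvals, seed=None):
    """Draw one integer in [0, maxval) for every entry of maxvals."""
    maxvals = tf.convert_to_tensor(maxvals, dtype=tf.int32)
    # One uniform float per bound; float64 keeps the scaling precise
    # for bounds up to the int32 range.
    u = tf.random.uniform(tf.shape(maxvals), dtype=tf.float64, seed=seed)
    scaled = tf.floor(u * tf.cast(maxvals, tf.float64))
    # Clamp so a rounding edge case can never return maxval itself.
    return tf.minimum(tf.cast(scaled, tf.int32), maxvals - 1)

print(randint_per_element([1, 10, 100, 1000]).numpy())
```

The whole draw is a handful of element-wise ops, so it works for a tensor of bounds of any shape without a Python loop.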

Misc

Make a Digital Twin of your Data Center with SONiC running on NVIDIA Air

Post date May 26, 2021

Testing out new network gear and running proof of concept (POC) tests for new technology can be difficult at the best of times, but in today’s environment, it’s even harder. We have made it incredibly easy to try out a full multi-switch network fabric using the Microsoft SONiC operating system, in a virtual data center that is available to anyone free of charge.

Data centers play a crucial role in business growth, and organizations aim to adopt the emerging open networking mindset to gain the flexibility to suit their unique business needs and operations.

To make an open networking data center a reality, IT departments need to train their staff to plan the replacement of the network core and to anticipate future challenges, all in a relatively short time.

NVIDIA supports “Pure SONiC”, a fully open-source version of SONiC. The Pure SONiC NOS is ideal for IT departments that don’t want another proprietary NOS and want full control and flexibility in their data center. As one of the most significant contributors to SONiC, NVIDIA has launched the “SONiC Air” platform, which uses a digital twin to support organizations ahead of a transition and provides a complete network experience.

What’s Supported?

Full CLI and API functionality
Control plane software including BGP, VLANs and containers
Automation and Zero Touch Provisioning (ZTP)
Network monitoring with streaming telemetry
Interop testing between NVIDIA Cumulus Linux and SONiC
Custom topologies and network designs

Using “SONiC Air” enables flexibility in the evaluation process, eliminating the limitations of a small POC that is not representative of the production environment. Staff can use the platform for free to build an exact network digital twin, validate configurations, confirm security policies or test CI/CD pipelines. In addition to CLI access, the platform provides full software functionality and access to core system components such as Docker containers and APIs. On the other hand, since the platform is software-based, it doesn’t support hardware features such as “What Just Happened”, which tells the end user why the ASIC has dropped a packet and assists in troubleshooting.

Beyond POCs, with NVIDIA Air, customers can build a tailor-made network topology, define any connectivity, and create configurations and automation for the initial deployment and ongoing operations, before any hardware even ships. Today’s customers are building their entire network with a digital twin and enabling services the same day equipment is installed.

Planning

Organizations that define their workflow pipelines in advance can dramatically reduce the time it takes to get an open networking product deployed and fully operational in their production environment, and can save on both CapEx and OpEx in the long term.

Organizations can use NVIDIA Air for end-to-end evaluation and testing, combining the network, servers and applications. NVIDIA Air supports not only SONiC and Cumulus Linux on the network but also Ubuntu and Red Hat servers. Leveraging Infrastructure as Code, the platform supports integration with the production CI/CD pipeline and version control repository, so teams can test the integrations and build the code for their future production environment.

Staff training

Training resources are always a challenge for IT departments. With NVIDIA Air, IT teams can now give every team member their own private replica of the production environment to learn on. No more waiting for hardware resources to be racked and stacked or balancing limited lab time across multiple users.

Get started

Customers who want to try SONiC in NVIDIA Air can watch the “SONiC experience” on-demand, hands-on workshop to help them get started. The workshop covers the basics of SONiC architecture, configuration and troubleshooting. We demystify the SONiC microservices architecture and highlight the different configuration approaches available in SONiC. The hands-on lab provides a step-by-step guide to build and configure a leaf-spine SONiC network from the ground up.

Links

SONiC Air
SONiC Experience OD workshop

Misc

NVIDIA Announces Financial Results for First Quarter Fiscal 2022

Post date May 26, 2021

Offsites

Cross-Modal Contrastive Learning for Text-to-Image Generation

Post date May 26, 2021

Posted by Han Zhang, Research Scientist and Jing Yu Koh, Software Engineer, Google Research

Automatic text-to-image synthesis, in which a model is trained to generate images from text descriptions alone, is a challenging task that has recently received significant attention. Its study provides rich insights into how machine learning (ML) models capture visual attributes and relate them to text. Compared to other kinds of inputs to guide image creation, such as sketches, object masks or mouse traces (which we have highlighted in prior work), descriptive sentences are a more intuitive and flexible way to express visual concepts. Hence, a strong automatic text-to-image generation system can also be a useful tool for rapid content creation and could be applied to many other creative applications, similar to other efforts to integrate machine learning into the creation of art (e.g., Magenta).

State-of-the-art image synthesis results are typically achieved using generative adversarial networks (GANs), which train two models — a generator, which tries to create realistic images, and a discriminator, which tries to determine if an image is real or fabricated. Many text-to-image generation models are GANs that are conditioned using text inputs in order to generate semantically relevant images. This is significantly challenging, especially when long, ambiguous descriptions are provided. Moreover, GAN training can be prone to mode collapse, a common failure case for the training process in which the generator learns to produce only a limited set of outputs, so that the discriminator fails to learn robust strategies to recognize fabricated images. To mitigate mode collapse, some approaches use multi-stage refinement networks that iteratively refine an image. However, such systems require multi-stage training, which is less efficient than simpler single-stage end-to-end models. Other efforts rely on hierarchical approaches that first model object layouts before finally synthesizing a realistic image. This requires the use of labeled segmentation data, which can be difficult to obtain.

In “Cross-Modal Contrastive Learning for Text-to-Image Generation,” to appear at CVPR 2021, we present the Cross-Modal Contrastive Generative Adversarial Network (XMC-GAN), which addresses text-to-image generation by learning to maximize the mutual information between image and text using inter-modal (image-text) and intra-modal (image-image) contrastive losses. This approach helps the discriminator to learn more robust and discriminative features, so XMC-GAN is less prone to mode collapse even with one-stage training. Importantly, XMC-GAN achieves state-of-the-art performance with a simple one-stage generation, as compared to previous multi-stage or hierarchical approaches. It is end-to-end trainable, and only requires image-text pairs (as opposed to labeled segmentation or bounding box data).

Contrastive Losses for Text-to-Image Synthesis
The goal of text-to-image synthesis systems is to produce clear, photo-realistic scenes with high semantic fidelity to their conditioned text descriptions. To achieve this, we propose to maximize the mutual information between the corresponding pairs: (1) images (real or generated) with a sentence describing the scene; (2) a generated image and a real image with the same description; and (3) regions of an image (real or generated) and words or phrases associated with them.

In XMC-GAN, this is enforced using contrastive losses. Similar to other GANs, XMC-GAN contains a generator for synthesizing images, and a discriminator that is trained to act as a critic between real and generated images. Three sets of data contribute to the contrastive loss in this system — the real images, the text that describes those images, and the images generated from the text descriptions. The individual loss functions for both the generator and the discriminator are combinations of the loss calculated from whole images with the full text description, combined with the loss calculated from sub-divided images with associated words or phrases. Then, for each batch of training data, we calculate the cosine similarity score between each text description and the real images, and likewise, between each text description and the batch of generated images. The goal is for the matching pairs (both text-to-image and real image-to-generated image) to have high similarity scores and for non-matching pairs to have low scores. Enforcing such a contrastive loss allows the discriminator to learn more robust and discriminative features.

Inter-modal and intra-modal contrastive learning in our proposed XMC-GAN text-to-image synthesis model.
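
To make the batch-level scoring described above concrete, here is a minimal sketch of a contrastive loss over cosine similarities, written in plain Python. It is not the XMC-GAN implementation: the softmax cross-entropy form, the temperature of 0.1, the function names, and the toy two-dimensional embeddings are all assumptions chosen for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def contrastive_loss(text_emb, image_emb, temperature=0.1):
    """Mean cross-entropy over rows of the text-to-image similarity matrix.

    Row i treats image i as the matching pair for text i and every other
    image in the batch as a non-matching pair.
    """
    losses = []
    for i, text in enumerate(text_emb):
        logits = [cosine(text, image) / temperature for image in image_emb]
        top = max(logits)  # subtract the max for a stable log-sum-exp
        log_norm = top + math.log(sum(math.exp(l - top) for l in logits))
        losses.append(log_norm - logits[i])
    return sum(losses) / len(losses)

texts = [[1.0, 0.0], [0.0, 1.0]]    # toy sentence embeddings
aligned = [[0.9, 0.1], [0.2, 0.8]]  # image i matches text i
swapped = aligned[::-1]             # images paired with the wrong text

print("aligned batch loss:", contrastive_loss(texts, aligned))
print("swapped batch loss:", contrastive_loss(texts, swapped))
```

The loss is small when matching pairs score higher than every non-matching pair in the batch and grows when they do not, which is the behavior the similarity objective above asks for.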

Results
We apply XMC-GAN to three challenging datasets — the first was a collection of MS-COCO descriptions of MS-COCO images, and the other two were datasets annotated with Localized Narratives, one of which covers MS-COCO images (which we call LN-COCO) and the other of which describes Open Images data (LN-OpenImages). We find that XMC-GAN achieves a new state of the art on each. The images generated by XMC-GAN depict scenes that are of higher quality than those generated using other techniques. On MS-COCO, XMC-GAN improves the state-of-the-art Fréchet inception distance (FID) score from 24.7 to 9.3, and is significantly preferred by human evaluators.

Selected qualitative results for generated images on MS-COCO.

Similarly, human raters prefer the image quality in XMC-GAN generated images 77.3% of the time, and 74.1% prefer its image-text alignment compared to three other state-of-the-art approaches (CP-GAN, SD-GAN, and OP-GAN).

Human evaluation on MS-COCO for image quality and text alignment. Annotators rank (anonymized and order-randomized) generated images from best to worst.

XMC-GAN also generalizes well to the challenging Localized Narratives dataset, which contains longer and more detailed descriptions. Our prior work TReCS tackles text-to-image generation for Localized Narratives using mouse trace inputs to improve image generation quality. Despite not receiving mouse trace annotations, XMC-GAN is able to significantly outperform TReCS on image generation on LN-COCO, improving state-of-the-art FID from 48.7 to 14.1. Incorporating mouse traces and other additional inputs into an end-to-end model such as XMC-GAN would be interesting to study in future work.

In addition, we train and evaluate on LN-OpenImages, which is more challenging than MS-COCO because the dataset is much larger, with images that cover a broader range of subject matter and that are more complex (8.4 objects on average). To the best of our knowledge, XMC-GAN is the first text-to-image synthesis model that is trained and evaluated on Open Images. XMC-GAN is able to generate high quality results, and sets a strong benchmark FID score of 26.9 on this very challenging task.

Random samples of real and generated images on Open Images.

Conclusion and Future Work
In this work, we present a cross-modal contrastive learning framework to train GAN models for text-to-image synthesis. We investigate several cross-modal contrastive losses that enforce correspondence between image and text. For both human evaluations and quantitative metrics, XMC-GAN establishes a marked improvement over previous models on multiple datasets. It generates high quality images that match their input descriptions well, including for long, detailed narratives, and does so while being a simpler, end-to-end model. We believe that this represents a significant advance towards creative applications for image generation from natural language descriptions. As we continue this research, we are continually evaluating responsible approaches, potential applications and risk mitigation, in accordance with our AI Principles.

Acknowledgements
This is joint work with Jason Baldridge, Honglak Lee, and Yinfei Yang. We would like to thank Kevin Murphy, Zizhao Zhang, and Dilip Krishnan for their helpful feedback. We also want to thank the Google Data Compute team for their work on conducting human evaluations. We are also grateful for general support from the Google Research team.