Categories
Misc

Orchestrated to Perfection: NVIDIA Data Center Grooves to Tune of Millionfold Speedups

The hum of a bustling data center is music to an AI developer’s ears — and NVIDIA data centers have found a rhythm of their own, grooving to the swing classic “Sing, Sing, Sing” in this week’s GTC keynote address. The lighthearted video, created with the NVIDIA Omniverse platform, features Louis Prima’s iconic music track… Read article >

The post Orchestrated to Perfection: NVIDIA Data Center Grooves to Tune of Millionfold Speedups appeared first on NVIDIA Blog.

Categories
Misc

Take Control This GFN Thursday With New Stratus+ Controller From SteelSeries

GeForce NOW gives you the power to game almost anywhere, at GeForce quality. And with the latest controller from SteelSeries, members can stay in control of the action on Android and Chromebook devices. This GFN Thursday takes a look at the SteelSeries Stratus+, now part of the GeForce NOW Recommended program. And it wouldn’t be… Read article >

The post Take Control This GFN Thursday With New Stratus+ Controller From SteelSeries appeared first on NVIDIA Blog.

Categories
Misc

What’s the utility of the audio embeddings from Google Audioset for audio classification?

I have extracted the audio embeddings from the Google AudioSet corpus (https://research.google.com/audioset/dataset/index.html). The audio embeddings contain a list of “bytes_list” values similar to the following:

 feature { bytes_list { value: "#226]06(N223K377207r36333337700Y322v935130300377311375215E342377J0000_00370222:2703773570024500377213jd267353377J33$2732673073537700207244Q00002060000312356<R325g30335616N224377270377237240377377321252j357O217377377,33000377|24600133400377357212267300b000000251236002333500326377327327377377223009{" } } 

From the documentation and forum discussions, I learnt that these embeddings are the output of a pretrained model (MFCC + CNN) applied to 10-second chunks of the respective YouTube videos. I have also learnt that these embeddings make it easier to work with deep learning models. How does this help ML engineers?

My confusion is this: if these audio embeddings already come from a pre-trained model, what is their utility? That is, how can I use these embeddings to train more advanced models for Sound Event Detection?
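
For context, a minimal sketch of how such frame-level embeddings are commonly consumed, assuming the layout of the public AudioSet release (a SequenceExample per clip with "labels" in the context and one 128-byte quantized "audio_embedding" frame per second; the file path is a placeholder — verify the feature names against your extracted files). The idea is to parse the TFRecords, pool the frames into a clip vector, and train a small classifier head on top:

import tensorflow as tf

NUM_CLASSES = 527  # size of the AudioSet label set used in the release

def parse_clip(serialized):
    # Clip-level labels live in the context; per-second embeddings in the feature lists.
    context, sequence = tf.io.parse_single_sequence_example(
        serialized,
        context_features={"labels": tf.io.VarLenFeature(tf.int64)},
        sequence_features={
            "audio_embedding": tf.io.FixedLenSequenceFeature([], tf.string)},
    )
    frames = tf.io.decode_raw(sequence["audio_embedding"], tf.uint8)
    frames = tf.cast(frames, tf.float32) / 255.0          # rough dequantization
    clip = tf.reduce_mean(frames, axis=0)                 # mean-pool to a (128,) vector
    labels = tf.sparse.to_dense(context["labels"])
    multi_hot = tf.reduce_max(tf.one_hot(labels, NUM_CLASSES), axis=0)
    return clip, multi_hot

ds = (tf.data.TFRecordDataset(["bal_train/sample.tfrecord"])   # placeholder path
      .map(parse_clip)
      .batch(32))

# A small multi-label classifier head trained on the frozen embeddings.
head = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(128,)),
    tf.keras.layers.Dense(256, activation="relu"),
    tf.keras.layers.Dense(NUM_CLASSES, activation="sigmoid"),
])
head.compile(optimizer="adam", loss="binary_crossentropy")
head.fit(ds, epochs=5)

The intended value of the embeddings is exactly this: the heavy feature extraction has already been done, so a lightweight head like the one above can be trained quickly, and the same features can feed more advanced sound event detection models.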

submitted by /u/sab_1120

Categories
Misc

Tensorflow Transfer Learning (VGG16) Error: ValueError: Shapes (None, 1) and (None, 4) are incompatible

Hello! So I am trying to create a multiclass classifier using VGG16 with transfer learning to classify users’ emotions. The data is sorted into 4 classes, each with its own directory, so I can use the ‘image_dataset_from_directory’ function.

def dataset_creator(directory=""):
    from keras.preprocessing.image import ImageDataGenerator
    data = image_dataset_from_directory(directory=directory, labels='inferred')
    return data

train_ds = dataset_creator(directory=traindir)
val_set = dataset_creator(directory="~/Documents/CC/visSystems/val_set/")
print(type(train_ds))

num_classes = 4
base_model = VGG16(weights="imagenet", include_top=False, input_shape=(256, 256, 3), classes=4)
base_model.trainable = False

normalization_layer = layers.Rescaling(scale=1./127.5, offset=-1)
flatten_layer = layers.Flatten()
dense_layer_0 = layers.Dense(520, activation='relu')
dense_layer_1 = layers.Dense(260, activation='relu')
dense_layer_2 = layers.Dense(160, activation='relu')
dense_layer_3 = layers.Dense(80, activation='relu')
prediction_layer = layers.Dense(4, activation='softmax')

model = models.Sequential([
    base_model,
    normalization_layer,
    flatten_layer,
    dense_layer_1,
    dense_layer_2,
    dense_layer_3,
    prediction_layer
])

from tensorflow.keras.callbacks import EarlyStopping

model.compile(
    optimizer='adam',
    loss='categorical_crossentropy',
    metrics=['accuracy'],
)

es = EarlyStopping(monitor='val_accuracy', mode='max', patience=3, restore_best_weights=True)

model.fit(train_ds, validation_data=val_set, epochs=10, callbacks=[es])
model.save("~/Documents/CC/visSystems/affect2model/saved_model")

My code correctly identifies X images across the 4 classes, but when I try to execute model.fit(), it returns this error:

ValueError: in user code:

    File "/home/blabs/.local/lib/python3.9/site-packages/keras/engine/training.py", line 878, in train_function *
        return step_function(self, iterator)
    File "/home/blabs/.local/lib/python3.9/site-packages/keras/engine/training.py", line 867, in step_function **
        outputs = model.distribute_strategy.run(run_step, args=(data,))
    File "/home/blabs/.local/lib/python3.9/site-packages/keras/engine/training.py", line 860, in run_step **
        outputs = model.train_step(data)
    File "/home/blabs/.local/lib/python3.9/site-packages/keras/engine/training.py", line 809, in train_step
        loss = self.compiled_loss(
    File "/home/blabs/.local/lib/python3.9/site-packages/keras/engine/compile_utils.py", line 201, in __call__
        loss_value = loss_obj(y_t, y_p, sample_weight=sw)
    File "/home/blabs/.local/lib/python3.9/site-packages/keras/losses.py", line 141, in __call__
        losses = call_fn(y_true, y_pred)
    File "/home/blabs/.local/lib/python3.9/site-packages/keras/losses.py", line 245, in call **
        return ag_fn(y_true, y_pred, **self._fn_kwargs)
    File "/home/blabs/.local/lib/python3.9/site-packages/keras/losses.py", line 1664, in categorical_crossentropy
        return backend.categorical_crossentropy(
    File "/home/blabs/.local/lib/python3.9/site-packages/keras/backend.py", line 4994, in categorical_crossentropy
        target.shape.assert_is_compatible_with(output.shape)

    ValueError: Shapes (None, 1) and (None, 4) are incompatible

How can I approach solving this issue? Thank you for your help.
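
One common cause of this exact mismatch, offered here as a hedged sketch rather than a definitive diagnosis: ‘image_dataset_from_directory’ returns integer labels by default (shape (None, 1)), while ‘categorical_crossentropy’ expects one-hot vectors (shape (None, 4)). Two usual remedies:

# Option 1: keep the default integer labels and switch the loss.
model.compile(
    optimizer='adam',
    loss='sparse_categorical_crossentropy',
    metrics=['accuracy'],
)

# Option 2: keep categorical_crossentropy and request one-hot labels instead.
data = image_dataset_from_directory(directory=directory,
                                    labels='inferred',
                                    label_mode='categorical')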

submitted by /u/blevlabs

Categories
Misc

Newb Question: How to host and load Tensorflow Models (as a directory) in the Cloud?

We have a Tensorflow workflow and model that works great when used in a local environment (Python) – however, we now need to push it to production (Heroku). So we’re thinking we need to move our model into some type of Cloud hosting.

If possible, I’d like to upload the model directory (not an H5 file) to a cloud service/storage provider and then load that model into Tensorflow.

Here is how we’re currently loading in a model, and what we’d like to be able to do:

# Current setup loads model from local directory
dnn_model = tf.keras.models.load_model('./neural_network/true_overall')

# We'd like to be able to load the model from a cloud service/storage
dnn_model = tf.keras.models.load_model('https://some-kinda-storage-service.com/neural_network/true_overall')

Downloading the directory and running it from a temp directory isn’t an option with our setup – so we’ll need to be able to run the model from the cloud. We don’t necessarily need to “train” the model in the cloud; we just need to be able to load it.

I’ve looked into some things like TensorServe and TensorCloud, but I’m not 100% sure if that’s what we need (we’re super new to Tensorflow and AI in general).

What’s the best way to get the models (as a directory) into the cloud so we can load them into our code?
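
A minimal sketch of one approach, assuming the SavedModel directory is copied to a Google Cloud Storage bucket (standard TensorFlow builds can typically read gs:// paths directly; the bucket name below is a placeholder):

import tensorflow as tf

# Load the SavedModel directory straight from a GCS bucket.
dnn_model = tf.keras.models.load_model("gs://your-bucket/neural_network/true_overall")

# With the tensorflow-io package installed, an S3 path works the same way:
# import tensorflow_io  # registers the s3:// filesystem
# dnn_model = tf.keras.models.load_model("s3://your-bucket/neural_network/true_overall")

Another common production pattern is to host the SavedModel behind TensorFlow Serving and call it over REST or gRPC, so the web process never loads the model at all.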

submitted by /u/jengl

Categories
Offsites

Auto-generated Summaries in Google Docs

For many of us, it can be challenging to keep up with the volume of documents that arrive in our inboxes every day: reports, reviews, briefs, policies and the list goes on. When a new document is received, readers often wish it included a brief summary of the main points in order to effectively prioritize it. However, composing a document summary can be cognitively challenging and time-consuming, especially when a document writer is starting from scratch.

To help with this, we recently announced that Google Docs now automatically generates suggestions to aid document writers in creating content summaries, when they are available. Today we describe how this was enabled using a machine learning (ML) model that comprehends document text and, when confident, generates a 1-2 sentence natural language description of the document content. However, the document writer maintains full control — accepting the suggestion as-is, making necessary edits to better capture the document summary or ignoring the suggestion altogether. Readers can also use this section, along with the outline, to understand and navigate the document at a high level. While all users can add summaries, auto-generated suggestions are currently only available to Google Workspace business customers. Building on grammar suggestions, Smart Compose, and autocorrect, we see this as another valuable step toward improving written communication in the workplace.

A blue summary icon appears in the top left corner when a document summary suggestion is available. Document writers can then view, edit, or ignore the suggested document summary.

Model Details
Automatically generated summaries would not be possible without the tremendous advances in ML for natural language understanding (NLU) and natural language generation (NLG) over the past five years, especially with the introduction of Transformer and Pegasus.

Abstractive text summarization, which combines the individually challenging tasks of long document language understanding and generation, has been a long-standing problem in NLU and NLG research. A popular method for combining NLU and NLG is training an ML model using sequence-to-sequence learning, where the inputs are the document words, and the outputs are the summary words. A neural network then learns to map input tokens to output tokens. Early applications of the sequence-to-sequence paradigm used recurrent neural networks (RNNs) for both the encoder and decoder.
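
As a loose illustration of that paradigm (not the production system; vocabulary size and dimensions are arbitrary placeholders), an RNN encoder-decoder can be sketched in Keras as follows:

import tensorflow as tf
from tensorflow.keras import layers

vocab_size, embed_dim, units = 8000, 128, 256

# Encoder: document tokens -> a fixed-size state summarizing the input.
enc_inputs = layers.Input(shape=(None,), dtype="int32")
enc_emb = layers.Embedding(vocab_size, embed_dim)(enc_inputs)
_, state_h, state_c = layers.LSTM(units, return_state=True)(enc_emb)

# Decoder: summary tokens generated so far + encoder state -> next-token distribution.
dec_inputs = layers.Input(shape=(None,), dtype="int32")
dec_emb = layers.Embedding(vocab_size, embed_dim)(dec_inputs)
dec_out, _, _ = layers.LSTM(units, return_sequences=True, return_state=True)(
    dec_emb, initial_state=[state_h, state_c])
next_token = layers.Dense(vocab_size, activation="softmax")(dec_out)

seq2seq = tf.keras.Model([enc_inputs, dec_inputs], next_token)
seq2seq.compile(optimizer="adam", loss="sparse_categorical_crossentropy")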

The introduction of Transformers provided a promising alternative to RNNs because Transformers use self-attention to provide better modeling of long input and output dependencies, which is critical in document summarization. Still, these models require large amounts of manually labeled data to train sufficiently, so the advent of Transformers alone was not enough to significantly advance the state-of-the-art in document summarization.

The combination of Transformers with self-supervised pre-training (e.g., BERT, GPT, T5) led to a major breakthrough in many NLU tasks for which limited labeled data is available. In self-supervised pre-training, a model uses large amounts of unlabeled text to learn general language understanding and generation capabilities. Then, in a subsequent fine-tuning stage, the model learns to apply these abilities on a specific task, such as summarization or question answering.

The Pegasus work took this idea one step further, by introducing a pre-training objective customized to abstractive summarization. In Pegasus pre-training, also called Gap Sentence Prediction (GSP), full sentences from unlabeled news articles and web documents are masked from the input and the model is required to reconstruct them, conditioned on the remaining unmasked sentences. In particular, GSP attempts to mask sentences that are considered essential to the document through different heuristics. The intuition is to make the pre-training as close as possible to the summarization task. Pegasus achieved state-of-the-art results on a varied set of summarization datasets. However, a number of challenges remained in applying this research advancement to a product.
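
To make the GSP idea concrete, here is a toy sketch that uses a crude word-overlap score as a stand-in for the ROUGE-based importance heuristic described in the Pegasus paper (illustrative only, not the actual pre-processing code):

def make_gsp_example(sentences, mask_token="<MASK>"):
    # Score each sentence by how much vocabulary it shares with the rest of the document.
    def overlap(i):
        sent = set(sentences[i].lower().split())
        rest = set(w for j, s in enumerate(sentences) if j != i
                   for w in s.lower().split())
        return len(sent & rest) / max(len(sent), 1)

    gap = max(range(len(sentences)), key=overlap)
    model_input = [mask_token if i == gap else s for i, s in enumerate(sentences)]
    target = sentences[gap]            # the model must reconstruct this sentence
    return " ".join(model_input), target

doc = ["Pegasus pre-trains on large amounts of unlabeled text.",
       "Sentences judged essential to the document are masked from the input.",
       "The model is trained to reconstruct the masked sentences."]
print(make_gsp_example(doc))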

Applying Recent Research Advances to Google Docs

  • Data

    Self-supervised pre-training results in an ML model that has general language understanding and generation capabilities, but a subsequent fine-tuning stage is critical for the model to adapt to the application domain. We fine-tuned early versions of our model on a corpus of documents with manually-generated summaries that were consistent with typical use cases.

    However, early versions of this corpus suffered from inconsistencies and high variation because they included many types of documents, as well as many ways to write a summary — e.g., academic abstracts are typically long and detailed, while executive summaries are brief and punchy. This led to a model that was easily confused because it had been trained on so many different types of documents and summaries that it struggled to learn the relationships between any of them.

    Fortunately, one of the key findings in the Pegasus work was that an effective pre-training phase required less supervised data in the fine-tuning stage. Some summarization benchmarks required as few as 1,000 fine-tuning examples for Pegasus to match the performance of Transformer baselines that saw 10,000+ supervised examples — suggesting that one could focus on quality rather than quantity.

    We carefully cleaned and filtered the fine-tuning data to contain training examples that were more consistent and represented a coherent definition of summaries. Even though we reduced the amount of training data, this led to a higher-quality model. The key lesson, consistent with recent work in domains like dataset distillation, was that it was better to have a smaller, high-quality dataset than a larger, high-variance dataset.

  • Serving

    Once we trained the high quality model, we turned to the challenge of serving the model in production. While the Transformer version of the encoder-decoder architecture is the dominant approach to train models for sequence-to-sequence tasks like abstractive summarization, it can be inefficient and impractical to serve in real-world applications. The main inefficiency comes from the Transformer decoder where we generate the output summary token by token through autoregressive decoding. The decoding process becomes noticeably slow when summaries get longer since the decoder attends to all previously generated tokens at each step. RNNs are a more efficient architecture for decoding since there is no self-attention with previous tokens as in a Transformer model.

    We used knowledge distillation, which is the process of transferring knowledge from a large model to a smaller, more efficient model, to distill the Pegasus model into a hybrid architecture of a Transformer encoder and an RNN decoder (a generic form of the distillation objective is sketched below). To improve efficiency we also reduced the number of RNN decoder layers. The resulting model had significant improvements in latency and memory footprint while the quality was still on par with the original model. To further improve the latency and user experience, we serve the summarization model using TPUs, which provide significant speed ups and allow more requests to be handled by a single machine.
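
The sketch below shows one generic form of that distillation objective, with the student pulled toward both the teacher's soft predictions and the ground-truth summary tokens; the temperature and mixing weight are illustrative assumptions, not production values.

import tensorflow as tf

def distillation_loss(teacher_logits, student_logits, labels,
                      temperature=2.0, alpha=0.5):
    # Soft targets: the teacher's token distribution at a softened temperature.
    soft_teacher = tf.nn.softmax(teacher_logits / temperature)
    log_student = tf.nn.log_softmax(student_logits / temperature)
    # Cross-entropy against the teacher's soft predictions.
    kd_term = -tf.reduce_sum(soft_teacher * log_student, axis=-1)
    # Standard cross-entropy against the ground-truth summary tokens.
    ce_term = tf.keras.losses.sparse_categorical_crossentropy(
        labels, student_logits, from_logits=True)
    return alpha * kd_term + (1.0 - alpha) * ce_term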

Ongoing Challenges and Next Steps
While we are excited by the progress so far, there are a few challenges we are continuing to tackle:

  • Document coverage: Developing a set of documents for the fine-tuning stage was difficult due to the tremendous variety that exists among documents, and the same challenge is true at inference time. Some of the documents our users create (e.g., meeting notes, recipes, lesson plans and resumes) are not suitable for summarization or can be difficult to summarize. Currently, our model only suggests a summary for documents where it is most confident, but we hope to continue broadening this set as our model improves.
  • Evaluation: Abstractive summaries need to capture the essence of a document while being fluent and grammatically correct. A specific document may have many summaries that can be considered correct, and different readers may prefer different ones. This makes it hard to evaluate summaries with automatic metrics alone; user feedback and usage statistics will be critical for us to understand and keep improving quality.
  • Long documents: Long documents are some of the toughest documents for the model to summarize because it is harder to capture all the points and abstract them in a single summary, and it can also significantly increase memory usage during training and serving. However, long documents are perhaps most useful for the model to automatically summarize because it can help document writers get a head start on this tedious task. We hope we can apply the latest ML advancements to better address this challenge.

Conclusion
Overall, we are thrilled that we can apply recent progress in NLU and NLG to continue assisting users with reading and writing. We hope the automatic suggestions now offered in Google Workspace make it easier for writers to annotate their documents with summaries, and help readers comprehend and navigate documents more easily.

Acknowledgements
The authors would like to thank the many people across Google that contributed to this work: AJ Motika, Matt Pearson-Beck, Mia Chen, Mahdis Mahdieh, Halit Erdogan, Benjamin Lee, Ali Abdelhadi, Michelle Danoff, Vishnu Sivaji, Sneha Keshav, Aliya Baptista, Karishma Damani, DJ Lick, Yao Zhao, Peter Liu, Aurko Roy, Yonghui Wu, Shubhi Sareen, Andrew Dai, Mekhola Mukherjee, Yinan Wang, Mike Colagrosso, and Behnoosh Hariri.

Categories
Misc

What Is Path Tracing?

Turn on your TV. Fire up your favorite streaming service. Grab a Coke. A demo of the most important visual technology of our time is as close as your living room couch. Propelled by an explosion in computing power over the past decade and a half, path tracing has swept through visual media. It brings… Read article >

The post What Is Path Tracing? appeared first on NVIDIA Blog.

Categories
Misc

TinyML Gearbox Fault Prediction on a $4 MCU

I would like to share my project and show you how to apply the TinyML approach to detect broken tooth conditions in a gearbox based on recorded vibration data.
I used a Raspberry Pi Pico, the Arduino IDE, and Neuton TinyML software.
I will answer questions such as:
Is it possible to make an AI-driven system that predicts gearbox failure on a simple $4 MCU? How can you automatically build a compact model that does not require any additional compression? Can a non-data scientist implement such projects successfully?

Introduction and Business Constraint

In industry (e.g., wind power, automotive), gearboxes often operate under random speed variations. A condition monitoring system is expected to detect faults such as broken tooth conditions and assess their severity using vibration signals collected under different speed profiles.

Modern cars have hundreds of thousands of parts and systems in which it is necessary to predict breakdowns and monitor temperature, pressure, and so on. As such, in the automotive industry, it is critically important to create and embed TinyML models that can run right on the sensors and open up a set of technological advantages, such as:

  • Internet independence
  • No waste of energy and money on data transfer
  • Advanced privacy and security

In my experiment I want to show how to easily create such a technology prototype to popularize the TinyML approach and use its incredible capabilities for the automotive industry.

https://preview.redd.it/9yqxlo08e5p81.png?width=1224&format=png&auto=webp&s=7e94e1cdc8fc3f9feff146052537faa6d887ffa1

Technologies Used

  • Neuton TinyML: I selected this solution since it is free to use and automatically creates tiny machine learning models deployable even on 8-bit MCUs. According to Neuton’s developers, you can create a compact model in one iteration without compression.
  • Raspberry Pi Pico: The chip has two Arm Cortex-M0+ cores running at 133 MHz, paired with 264 KB of on-chip RAM. The device supports up to 16 MB of off-chip flash storage, has a DMA controller, and includes two UARTs, two SPIs, two I2Cs, and a USB 1.1 controller. It provides 16 PWM channels and 30 GPIO pins, four of which are suitable for analog input. All of this comes with a net $4 price tag.

https://preview.redd.it/i9hecq8be5p81.png?width=1265&format=png&auto=webp&s=324da97a54104f7b42ecc52e9679118c75d04580

The goal of this tutorial is to demonstrate how you can easily build a compact ML model to solve a multi-class classification task to detect broken tooth conditions in the gearbox.

Dataset Description

The Gearbox Fault Diagnosis Dataset includes vibration data recorded using SpectraQuest’s Gearbox Fault Diagnostics Simulator.

The dataset was recorded using 4 vibration sensors placed in four different directions, under load varying from 0 to 90 percent. Two scenarios are included: 1) healthy condition and 2) broken tooth condition.

There are 20 files in total, 10 for a healthy gearbox and 10 for a broken one. Each file corresponds to a given load from 0% to 90% in steps of 10%. You can find this dataset through the link: https://www.kaggle.com/datasets/brjapon/gearbox-fault-diagnosis
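
As a rough sketch of how these files can be assembled into a single labeled table (the file-naming convention, the healthy/broken prefix, and the sensor column names are assumptions about this Kaggle download; adjust them to what you actually see):

import glob
import os
import pandas as pd

frames = []
for path in glob.glob("gearbox-fault-diagnosis/*.csv"):
    df = pd.read_csv(path)                              # assumed columns: a1, a2, a3, a4
    name = os.path.basename(path).lower()
    df["label"] = 0 if name.startswith("h") else 1      # assumed: h* = healthy, b* = broken tooth
    frames.append(df)

data = pd.concat(frames, ignore_index=True)
print(data["label"].value_counts())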

https://preview.redd.it/xyihhiwde5p81.png?width=1220&format=png&auto=webp&s=c5bbec2110bf62a3d89416ed4c7dbfc434912919

The experiment will be conducted on a $4 MCU, with no cloud computing carbon footprints 🙂

Step 1: Model training

For model training, I’ll use the free-of-charge Neuton TinyML platform. Once the solution is created, proceed to upload the dataset (keep in mind that the only currently supported format is CSV).

https://preview.redd.it/655xf0uhe5p81.png?width=920&format=png&auto=webp&s=6b4f87374ef3f21aad3cd79b64303b7b2334e67d

https://preview.redd.it/1m8dw3rie5p81.png?width=922&format=png&auto=webp&s=78ee80b7d7c9ae09f85c2490bc96e69e8687857d

https://preview.redd.it/ch9l64nle5p81.png?width=740&format=png&auto=webp&s=1d902649fb8ca03e7082ebc1c4d8803cf910db38

Number of coefficients = 397, File Size for Embedding = 2.52 KB. That’s super cool! It is a really small model! Upon completion of model training, click on the Prediction tab, and then click on the Download button next to Model for Embedding to download the model library file that we are going to use for our device.

Step 2: Embedding on Raspberry Pico

Once you have downloaded the model files, it’s time to add our custom functions and actions. I am using the Arduino IDE to program the Raspberry Pi Pico.

Setting up the Arduino IDE for the Raspberry Pi Pico:

https://reddit.com/link/tkw3e1/video/qsmo4yepe5p81/player

https://preview.redd.it/w5paiptje5p81.png?width=880&format=png&auto=webp&s=ff61cb8124ec4a9d371c52adc6dd9cafb9bf25fc

https://preview.redd.it/k2scjbwse5p81.png?width=890&format=png&auto=webp&s=bad649ff88631896cf58acffdaee6700f581528f

Note: Since we are going to run classification on the test dataset, we will use the CSV utility provided by Neuton to run inference on the data sent to the MCU via USB.

Here is our project directory:

https://preview.redd.it/qqol86o3f5p81.png?width=903&format=png&auto=webp&s=b04430faf2f340a01d8a795e302f6db67eb1b2eb

https://preview.redd.it/dp85hfs4f5p81.png?width=645&format=png&auto=webp&s=fde676f02850c6ed9d56682668e765741bf58320

https://preview.redd.it/usq289n5f5p81.png?width=669&format=png&auto=webp&s=3ad529992b22473d1d773a5ad498acc145fb4e33

I tried to build the same model with TensorFlow and TensorFlow Lite as well. My model built with Neuton TinyML turned out to be 4.3% better in terms of accuracy and 15.3 times smaller in model size than the one built with TF Lite. Speaking of the number of coefficients, TensorFlow’s model has 9,330 coefficients, while Neuton’s model has only 397 (23.5 times fewer).

The resultant model footprint and inference time are as follows:

https://preview.redd.it/89xwqbt8f5p81.png?width=740&format=png&auto=webp&s=7ad3b0ff53614a291d059214a20c0100ed6ccbac

submitted by /u/sumitaiml

Categories
Misc

NVIDIA Showcases Novel AI Tools in DRIVE Sim to Advance Autonomous Vehicle Development

Autonomous vehicle development and validation require the ability to replicate real-world scenarios in simulation. At GTC, NVIDIA founder and CEO Jensen Huang showcased new AI-based tools for NVIDIA DRIVE Sim that accurately reconstruct and modify actual driving scenarios. These tools are enabled by breakthroughs from NVIDIA Research that leverage technologies such as NVIDIA Omniverse platform… Read article >

The post NVIDIA Showcases Novel AI Tools in DRIVE Sim to Advance Autonomous Vehicle Development appeared first on NVIDIA Blog.

Categories
Misc

NVIDIA Inception Introduces New and Updated Benefits for Startup Members to Accelerate Computing

This week at GTC, we’re celebrating – celebrating the amazing and impactful work that developers and startups are doing around the world. Nowhere is that more apparent than among the members of our global NVIDIA Inception program, designed to nurture cutting-edge startups who are revolutionizing industries. The program is free for startups of all sizes… Read article >

The post NVIDIA Inception Introduces New and Updated Benefits for Startup Members to Accelerate Computing appeared first on NVIDIA Blog.