DataBloom - Part 343

Misc

Teens Develop Handwriting-Recognition AI for Detecting Parkinson’s Disease

Post author By
Post date March 29, 2022
No Comments on Teens Develop Handwriting-Recognition AI for Detecting Parkinson’s Disease

When Tanish Tyagi published his first research paper a year ago on deep learning to detect dementia, it started a family-driven pursuit. Great-grandparents in his family had suffered from Parkinson’s, a genetic disease that affects more than 10 million people worldwide. So the now 16-year-old turned to that next, together with his sister, Riya, 14. Read article >

The post Teens Develop Handwriting-Recognition AI for Detecting Parkinson’s Disease appeared first on NVIDIA Blog.

Misc

Build Speech AI in Multiple Languages and Train Large Language Models with the Latest from Riva and NeMo Megatron

Post author By
Post date March 28, 2022
No Comments on Build Speech AI in Multiple Languages and Train Large Language Models with the Latest from Riva and NeMo Megatron

Graphical representation of automatic speech recognition for transcription, controllable text-to-speech, and natural language processing in a chatbot. Read a recap of conversational AI announcements from NVIDIA GTC.

Major updates to Riva, an SDK for building speech AI applications, and a paid Riva Enterprise offering were announced at NVIDIA GTC 2022 last week. Several key updates to NeMo Megatron, a framework for training Large Language Models, were also announced.

Riva 2.0 general availability

Riva offers world-class accuracy for real-time automatic speech recognition (ASR) and text-to-speech (TTS) skills across multiple languages and can be deployed on-prem, in any cloud. Industry leaders such as Snap, T-Mobile, RingCentral, and Kore.ai use Riva in customer care center applications, transcription, and virtual assistants.

The latest Riva version includes:

ASR in multiple languages: English, Spanish, German, Russian, and Mandarin.
High-quality TTS voices customizable for unique voice fonts.
Domain-specific customization with TAO Toolkit or NVIDIA NeMo for unparalleled accuracy in accent, domain, and country-specific jargon.
Support to run in cloud, on-prem, and on embedded platforms.

A GIF showing how to control Riva text-to-speech pitch and speed using SSML tags. — *Figure 1: NVIDIA Riva controllable text-to-speech makes it easy to adjust pitch and speed using SSML tags.*

Try Riva automatic speech recognition on the Riva product page.

Defined.ai has collaborated with NVIDIA to provide a smooth workflow for enterprises looking to purchase speech training and validation data across languages, domains, and recording types. A sample of the DefinedCrowd dataset for NVIDIA developers can be found here.

Download Riva, which is available free for members of the NVIDIA Developer program from NGC.

Riva Enterprise

NVIDIA also introduced Riva Enterprise, a paid offering for enterprises deploying Riva at scale with business-standard support from NVIDIA experts.

Benefits include:

Unlimited use of ASR and TTS services on any cloud and on-prem platforms.
Access to NVIDIA AI experts during local business hours for guidance on configurations and performance.
Long-term support for maintenance control and upgrade schedule.
Priority access to new releases and features.

Riva Enterprise is available as a free trial on NVIDIA Launchpad for enterprises to evaluate and prototype their applications.

Riva Enterprise on launchpad includes guided labs to:

Interact with Real-Time Speech AI APIs.
Add Speech AI Capabilities to a Conversational AI Application.
Fine-Tune a Speech AI Pipeline on Custom Data for Higher Accuracy.

Apply for your Riva Enterprise trial.

Learn more about how to build, optimize, and deploy speech AI applications from the Conversational AI Demystified GTC session.

NeMo Megatron

NVIDIA announced new updates to NVIDIA NeMo Megatron, a framework for training large language models (LLM) up to trillions of parameters. Built on innovations from the Megatron paper, with NeMo Megatron research institutions and enterprises can train any LLM to convergence. NeMo Megatron provides data preprocessing, parallelism (data, tensor, and pipeline), orchestration and scheduling, and auto-precision adaptation.

It consists of thoroughly tested recipes, popular LLM architecture implementations, and necessary tools for organizations to quickly start their LLM journey.

AI Sweden, JD.com, Naver, and the University of Florida are early adopters of NVIDIA technologies for building large language models.

The latest version includes:

Hyperparameter tuning tool—automatically creates recipes based on customers’ needs and infrastructure limitations.
Reference recipes for T5 and mT5 models.
Support to train LLM on cloud, starting with Azure.
Distributed data preprocessing scripts to shorten end-to-end training time.

Apply for NeMo Megatron early access.

Learn more about interesting applications of LLMs and best practices to deploy them in the Natural Language Understanding in Practice: Lessons Learned from Successful Enterprise Deployments GTC session.

Misc

Is there any way to use tf 1.14 with rtx 3090

Post author By
Post date March 28, 2022
No Comments on Is there any way to use tf 1.14 with rtx 3090

Hi guys

ive worked on machine vision with tensorflow 1.14 and recently upgraded rtx2070s to rtx3090.

and seems like rtx3090 doesnt support tf<2.5 version as it needs cuda 11.0. ( cuz i met many strange errors that i havent seen on rtx2070s)

so i wanna know if it is still possible to use 1.14 version with rtx3090 or should i just give up and use recent tensorflow version ?

thank you

submitted by /u/TopYoo
[visit reddit] [comments]

Misc

Training and testing with embeddings in transfer learning

Post author By
Post date March 28, 2022
No Comments on Training and testing with embeddings in transfer learning

I am doing transfer learning with google audioset embeddings. According to the documentation,

the embedding layer does not include a final non-linear activation, so the embedding value is pre-activation

I want to train and test a new model on top of these embedding layer with the embedding data. I have planned to do the following

Create new dense layers.
Convert the embeddings from byte string to tensor. Split these embeddings to train, test and split dataset.
Input these tensors to the new model.
Validate and test the model using validate dataset and test dataset.

I have two confusions with this implementation

Is using the embeddings as input of the new layers enough for the transfer learning? I have seen in some Transfer Learning implementation that they load pre-trained weights to the new model and freeze the layers involving those weights. But in those implementation, they use new data for training, not the embeddings from the pre-trained model. I am confused how that works.
Is it okay to split the embeddings to train, test and validate dataset? I am not sure if all the embeddings were used for training the pre-trained model. If they all were used, then does it make sense to use part of them as validation and test dataset?

submitted by /u/sab_1120
[visit reddit] [comments]

Misc

Problem with training VAE

Post author By
Post date March 27, 2022
No Comments on Problem with training VAE

Hello.

I’m new in to tensorflow. Been trying recreate VAE from tutorial ( author does not respond for questions) but i keeps getting error while training network :

ERROR:

Traceback (most recent call last): File "...train.py", line 45, in <module> autoencoder = train(x_train, LEARNING_RATE, BATCH_SIZE, EPOCHS) #here is problem File "...train.py", line 36, in train autoencoder.compile(learning_rate) File "...autoencoder.py", line 61, in compile self.model.compile(optimizer=optimizer, File "...libsite-packagestensorflowpythontrainingtrackingbase.py", line 530, in _method_wrapper result = method(self, *args, **kwargs) File "...libsite-packagestensorflowpythonkerasenginetraining_v1.py", line 444, in compile self._cache_output_metric_attributes(metrics, weighted_metrics) File "...libsite-packagestensorflowpythonkerasenginetraining_v1.py", line 1800, in _cache_output_metric_attributes self._per_output_metrics = training_utils_v1.collect_per_output_metric_info( File "...libsite-packagestensorflowpythonkerasenginetraining_utils_v1.py", line 910, in collect_per_output_metric_info metric_fn._from_serialized = from_serialized # pylint: disable=protected-access AttributeError: 'method' object has no attribute '_from_serialized'

LINE 41-48 in train.py ( first two errors):

if __name__ == "__main__": x_train = load_fsdd(SPECTROGRAMS_PATH) autoencoder = train(x_train,LEARNING_RATE,BATCH_SIZE, EPOCHS) autoencoder.save("modelv1.0")

LINE 59-64 in autoencoder.py:

def compile(self, learning_rate=0.0001): optimizer = Adam(learning_rate=learning_rate) self.model.compile(optimizer=optimizer, loss=self._calculate_combined_loss, metrics=[self._calculate_reconstruction_loss, self._calculate_kl_loss])

I’m using PyCharm CE 2021.1.1 and tensorflow 2.3.1

submitted by /u/MonstrousPudding
[visit reddit] [comments]

Misc

Training and testing the embeddings in transfer learning

Post author By
Post date March 27, 2022
No Comments on Training and testing the embeddings in transfer learning

I am doing transfer learning with google audioset embeddings. The audioset corpus consists of pre-trained embeddings that are pre-activation (the final layers are removed).

While studying transfer learning, I have noticed that the checkpoint weights are used and the layers that produced these weights are frozen while the a model is built on top of the previous model.

Should I use the pre-trained weights of Audioset while using the embeddings of audioset itself for training the new model? It does not sound right as these embeddings are the bi product of these weights already. Please correct me if I am wrong.

submitted by /u/sab_1120
[visit reddit] [comments]

Misc

Mask RCNN dataset annotation tool

Post author By
Post date March 27, 2022
No Comments on Mask RCNN dataset annotation tool

I am using labelImg to annotate custom datasets for Faster RCNN. What tools do you use to annotate images for Mask RCNN – create the masks?

submitted by /u/giakou4
[visit reddit] [comments]

Misc

Is it possible to start TensorFlow from an intermediate layer?

Post author By
Post date March 26, 2022
No Comments on Is it possible to start TensorFlow from an intermediate layer?

Hello,

I was wondering

Lets say if I get the output of an intermediary layer, would it be possible to feed the data back into the intermediary layer and resume the processing only from that layer?

Just wondering…..

submitted by /u/lulzintosh123
[visit reddit] [comments]

Misc

Can any one tell me what are initial value of self.w made by Add_weight() function (when i=0) , i think they are zeros , note that units = 200 and input_dim = 128

Can any one tell me what are initial value of self.w made by Add_weight() function (when i=0) , i think they are zeros , note that units = 200 and input_dim = 128

submitted by /u/hichemito
[visit reddit] [comments]

Misc

Configuring Pipeline for MSCoco

Post author By
Post date March 25, 2022
No Comments on Configuring Pipeline for MSCoco

I have been training a model using MS Coco 2017 dataset.

The Coco dataset is divided between train, test and validation. The key difference is the test dataset does not have bounding box annotations.

I didn’t realise this and I set the eval_input _reader to point at the Coco test dataset .tfrecord files.

Is this incorrect? Should I instead point it towards the validation dataset which has the bounding-box annotations? It’s strange because my model is still working. Though not very well.

Very confused by it all. Why doesn’t the cocodataset label the test images?

submitted by /u/manduala
[visit reddit] [comments]