Categories
Misc

Export fine-tuned BERT Model trained on Cloud TPU to HDF5 format

I’m using a Colab environment to fine-tune a BERT model (for
reference, this is the notebook:
_with_Cloud_TPU_Sentence_Classification_Tasks.ipynb).
How can I export the fine-tuned model (it’s a TPUEstimator object) to
HDF5 format? I need to use the trained model locally on a CPU.

submitted by /u/spaceape__


Possibly serious issue with tf.image.per_image_standardization

I came across this issue in my own projects and found the issue
linked here on the TensorFlow GitHub, but I feel like it isn’t
getting much traction given the potential severity of the
problem.

Basically, there was a non-release push to TF between 1.14 and
1.15 that broke some functionality of the
tf.image.per_image_standardization routine when used on unsigned
integer inputs. Most of the information content in images ends
up getting lost because of the naïve type conversions done in
per_image_standardization after 1.14. This isn’t addressed in the
documentation, and it is pretty clearly a major change in behavior
befitting a major release, but it was introduced before one,
which likely points to an untested edge case.

I’m concerned that the issue isn’t getting much traction but
could potentially impact labs all over the place. The simple
solution is to convert your unsigned int images to float before
calling per_image_standardization, but that isn’t obvious from any
of the documentation, and used to be handled naturally by the
method.
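
To illustrate why the float conversion matters, here is a minimal pure-Python sketch. It does not reproduce TensorFlow's implementation; it only assumes, for illustration, that a standardized result is naively converted back to unsigned 8-bit values, which matches the symptom described above. The helper names `standardize` and `to_uint8` and the pixel values are hypothetical.

```python
import math

def standardize(pixels):
    """Scale a flat list of pixel values to zero mean and unit variance."""
    n = len(pixels)
    mean = sum(pixels) / n
    variance = sum((p - mean) ** 2 for p in pixels) / n
    # Guard against division by zero on uniform images, in the spirit of the
    # documented adjusted stddev max(stddev, 1/sqrt(N)).
    std = max(math.sqrt(variance), 1.0 / math.sqrt(n))
    return [(p - mean) / std for p in pixels]

def to_uint8(values):
    """Naive conversion to unsigned 8-bit: clip to [0, 255], drop fractions."""
    return [int(min(255, max(0, v))) for v in values]

pixels = [0, 64, 128, 255]                       # hypothetical flattened uint8 image
as_float = standardize([float(p) for p in pixels])
as_uint8 = to_uint8(as_float)                    # assumed lossy round trip

print("kept as float:  ", [round(v, 3) for v in as_float])
print("forced to uint8:", as_uint8)
```

The float result keeps four distinct values centered on zero, while the unsigned version collapses almost everything to zero because negative values and fractions cannot be represented. In TensorFlow, the workaround described above amounts to casting first, for example with tf.cast(image, tf.float32), and then calling tf.image.per_image_standardization on the result.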

Thoughts?

Edit: formatting.

submitted by /u/DrSparkle713


[100% OFF] Object Detection Web App with TensorFlow, OpenCV and Flask


submitted by /u/codeeuler1


Model training stalls forever after just a few batches.

I posted this as an issue on GitHub; maybe someone here will have a
magic solution:

  • TensorFlow version: 2.4.0-rc4 (also tried with stable
    2.4.0)
  • TensorFlow Git version: v2.4.0-rc3-20-g97c3fef64ba
  • Python version: 3.8.5
  • CUDA/cuDNN version: CUDA 11.0, cuDNN 8.0.4
  • GPU model and memory: Nvidia RTX 3090, 24GB RAM

Model training regularly freezes for large models.

Sometimes the first batch or so works, but then just a few
batches later training seems stuck in a loop. From my activity
monitor, I see GPU CUDA use hovering around 100%. This goes on for
minutes or more, with no more batches being trained.

I don’t see an OOM error, nor does it seem like I’m hitting
memory limits in activity monitor or nvidia-smi.

I would expect the first batch to take a bit longer, then any
subsequent batches to take less than 1s, and never to have a random
batch take minutes or stall forever.

Run through all the cells in the notebook shared below to
initialize the model, then run the final cell just a few times.
Eventually it will hang and never finish.


https://github.com/not-Ian/tensorflow-bug-example/blob/main/tensorflow%20error%20example.ipynb

Smaller models train quickly as expected, however I think even
then they eventually stall out after training many, many batches. I
had another similar, small VAE like in my example that trained for
5k-10k batches overnight before stalling.

Someone suggested I set a hard memory limit on the GPU like
this:

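import tensorflow as tf

# Cap TensorFlow's allocation on the first GPU at 23 GB (memory_limit is in MB).
gpus = tf.config.experimental.list_physical_devices('GPU')
tf.config.experimental.set_virtual_device_configuration(
    gpus[0],
    [tf.config.experimental.VirtualDeviceConfiguration(memory_limit=1024 * 23)])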

And finally, I’ve tried using the hacky ptxas.exe file from CUDA
11.1 in my CUDA 11.0 installation. This seems to remove a warning?
But still no change.

Open to any other ideas, thanks.

submitted by /u/Deinos_Mousike


Newbie here ^^; trying to build TensorFlow for an old GPU

I have a GeForce 840M, which has CUDA compute capability 5.0. My
project has dependencies on TensorFlow, OpenCV, CUDA 7.5+ and
cuDNN 5.0+ (https://github.com/dvschultz/neural-style-tf).

I keep getting this error:

“W tensorflow/stream_executor/platform/default/dso_loader.cc:59]
Could not load dynamic library ‘cudart64_101.dll’; dlerror:
cudart64_101.dll not found”

TensorFlow doesn’t see my GPU.

1. Is it because I have a higher CUDA version than my GPU supports?

2. Is it because my TensorFlow version is 2.3.1?

Thanks.

submitted by /u/elyakubu


Inception to the Rule: AI Startups Thrive Amid Tough 2020

2020 served up a global pandemic that roiled the economy. Yet the startup ecosystem has managed to thrive and even flourish amid the tumult. That may be no coincidence. Crisis breeds opportunity. And nowhere has that been more prevalent than with startups using AI, machine learning and data science to address a worldwide medical emergency …

The post Inception to the Rule: AI Startups Thrive Amid Tough 2020 appeared first on The Official NVIDIA Blog.


Shifting Paradigms, Not Gears: How the Auto Industry Will Solve the Robotaxi Problem

A giant toaster with windows. That’s the image for many when they hear the term “robotaxi.” But there’s much more to these futuristic, driverless vehicles than meets the eye. They could be, in fact, the next generation of transportation. Automakers, suppliers and startups have been dedicated to developing fully autonomous vehicles for the past decade, …

The post Shifting Paradigms, Not Gears: How the Auto Industry Will Solve the Robotaxi Problem appeared first on The Official NVIDIA Blog.


Role of the New Machine: Amid Shutdown, NVIDIA’s Selene Supercomputer Busier Than Ever

And you think you’ve mastered social distancing. Selene is at the center of some of NVIDIA’s most ambitious technology efforts. Selene sends thousands of messages a day to colleagues on Slack. Selene’s wired into GitLab, a key industry tool for tracking the deployment of code, providing instant updates to colleagues on how their projects are …

The post Role of the New Machine: Amid Shutdown, NVIDIA’s Selene Supercomputer Busier Than Ever appeared first on The Official NVIDIA Blog.


AI at Your Fingertips: NVIDIA Launches Storefront in AWS Marketplace

AI is transforming businesses across every industry, but like any journey, the first steps can be the most important. To help enterprises get a running start, we’re collaborating with Amazon Web Services to bring 21 NVIDIA NGC software resources directly to the AWS Marketplace. The AWS Marketplace is where customers find, buy and immediately start …

The post AI at Your Fingertips: NVIDIA Launches Storefront in AWS Marketplace appeared first on The Official NVIDIA Blog.


How can I train a model on a HUGE dataset?

So I have a huge dataset that devours my 32GB memory and then
crashes every time before I can even begin training. Is it possible
to break the dataset into chunks and train my model that way?

I’m fairly new to tensorflow so I’m not sure how to go about it.
Can anyone help?

Thank you.

EDIT: the data is time series data (from a csv) that I’m loading
into a pandas dataframe. From there, the data is being broken up
into samples with a 10 step window. I have about 90M samples with
the shape (90M, 10, 1) that should then be fed into the LSTM. The
problem is that the samples crash the RAM and I have to start all
over again each time.
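
One possible way to avoid holding all ~90M windows in RAM at once is to stream the CSV and build the 10-step windows lazily, one batch at a time. The sketch below uses only the standard library. The function names, the single-column layout, the batch size and the in-memory stand-in for the CSV file are all assumptions for illustration, and it yields inputs only (no targets).

```python
import csv
import io
from collections import deque
from itertools import islice

def sliding_windows(rows, column=0, window=10):
    """Lazily yield overlapping windows of `window` consecutive values."""
    buf = deque(maxlen=window)
    for row in rows:
        buf.append([float(row[column])])   # each time step has shape (1,)
        if len(buf) == window:
            yield list(buf)                # one sample of shape (window, 1)

def batches(samples, batch_size):
    """Group an iterator of samples into lists of at most `batch_size` items."""
    samples = iter(samples)
    while True:
        batch = list(islice(samples, batch_size))
        if not batch:
            return
        yield batch

# Hypothetical stand-in for the real file; in practice this would be
# something like open("data.csv", newline="").
fake_csv = io.StringIO("\n".join(str(i) for i in range(25)))
reader = csv.reader(fake_csv)

# Collected into a list only to inspect the result; a training loop would
# consume the generator one batch at a time instead.
all_batches = list(batches(sliding_windows(reader), batch_size=4))
print(len(all_batches), "batches of", len(all_batches[0]), "windows")
```

Only one window buffer and one batch are alive at any time, so memory use stays flat regardless of file length. A generator like this could then be fed to the model batch by batch, for example by wrapping it with tf.data.Dataset.from_generator.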

submitted by /u/dsm88
