Help setting up tf-GPU and cuDNN.

I am trying to get training running on my GPU. Setup: GTX 1660 Ti,
TF 2.4.1, CUDA 11.2, Python 3.8.7.

My NN was taking 15 minutes per epoch on some dummy data, so I am
setting up GPU training. At one point I got through 13 epochs
before it got stuck (maybe it ran out of memory?). Many GitHub
issue threads later, I am stuck at one of two errors:

CUBLAS_STATUS_ALLOC_FAILED
CUDNN_STATUS_EXECUTION_FAILED

The only tickets I have found online were resolved by setting a
memory limit or setting "allow_growth" to true. Twice this has
gotten me past the first error, but it isn't working
consistently. Ultimately I end up at the second error either
way.
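
For reference, the widely reported workaround can be sketched roughly as follows. This is a minimal sketch, assuming the `tf.config.experimental` API as it exists in TF 2.x; the helper name `configure_gpu_memory` and its `memory_limit_mb` parameter are hypothetical and only for illustration, not something taken from the tickets.

```python
def configure_gpu_memory(memory_limit_mb=None):
    """Apply one of the two commonly suggested GPU memory settings.

    Hypothetical helper for illustration. It must be called before any
    op touches the GPU, because TensorFlow rejects these settings once
    the device has been initialized.
    """
    # Imported here so the helper can be defined without TensorFlow loaded.
    import tensorflow as tf

    gpus = tf.config.list_physical_devices("GPU")
    for gpu in gpus:
        if memory_limit_mb is None:
            # "allow_growth": allocate GPU memory on demand instead of
            # reserving nearly all of it at startup.
            tf.config.experimental.set_memory_growth(gpu, True)
        else:
            # Hard cap: expose a single logical device limited to
            # memory_limit_mb megabytes.
            tf.config.experimental.set_virtual_device_configuration(
                gpu,
                [tf.config.experimental.VirtualDeviceConfiguration(
                    memory_limit=memory_limit_mb)],
            )
    # Number of physical GPUs that were configured (0 means none visible).
    return len(gpus)
```

Calling `configure_gpu_memory()` at the very top of the training script enables memory growth; calling `configure_gpu_memory(4096)` instead applies a cap (the 4096 MB figure is an arbitrary example, not a recommendation for this card). The two settings are alternatives, so only one should be applied to a given GPU.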

Has anyone encountered this and not had the widely reported
solution work? Thanks in advance if anyone can help me. I just spent
waaaaay too long trying to get this going and am finally out of
ways to google it.

submitted by /u/skeerp
