I have a problem using LSTM from Keras. When I try to train the model, the training stops at “Epoch 1/50” and never progresses. The program just exits with a “Process finished” code and shows no error message explaining why training did not happen.
The problem only occurs when I use the LSTM layer’s default parameters. If I, for example, pass a different activation argument such as “relu”, it works fine.
The problem seems to be specific to my local computer, since the same code runs on Colab both with and without the default parameters, but locally I cannot train the model at all. This is especially frustrating because no error messages are displayed.
I really hope there is a skilled person who can help or guide me in the right direction with this problem.
Thanks 🙂
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, LSTM, InputLayer
2021-08-27 10:20:22.799484: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library cudart64_110.dll
2021-08-27 10:20:26.443509: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library nvcuda.dll
2021-08-27 10:20:26.488972: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1733] Found device 0 with properties: pciBusID: 0000:01:00.0 name: NVIDIA GeForce RTX 2060 computeCapability: 7.5 coreClock: 1.2GHz coreCount: 30 deviceMemorySize: 6.00GiB deviceMemoryBandwidth: 245.91GiB/s
2021-08-27 10:20:26.489580: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library cudart64_110.dll
2021-08-27 10:20:26.532650: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library cublas64_11.dll
2021-08-27 10:20:26.533109: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library cublasLt64_11.dll
2021-08-27 10:20:26.559639: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library cufft64_10.dll
2021-08-27 10:20:26.565224: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library curand64_10.dll
2021-08-27 10:20:26.637251: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library cusolver64_11.dll
2021-08-27 10:20:26.660612: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library cusparse64_11.dll
2021-08-27 10:20:26.661815: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library cudnn64_8.dll
2021-08-27 10:20:26.662201: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1871] Adding visible gpu devices: 0
2021-08-27 10:20:26.662770: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2021-08-27 10:20:26.664656: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1733] Found device 0 with properties: pciBusID: 0000:01:00.0 name: NVIDIA GeForce RTX 2060 computeCapability: 7.5 coreClock: 1.2GHz coreCount: 30 deviceMemorySize: 6.00GiB deviceMemoryBandwidth: 245.91GiB/s
2021-08-27 10:20:26.665660: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1871] Adding visible gpu devices: 0
2021-08-27 10:20:27.284225: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1258] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-08-27 10:20:27.284551: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1264] 0
2021-08-27 10:20:27.284738: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1277] 0: N
2021-08-27 10:20:27.285152: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1418] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 3961 MB memory) -> physical GPU (device: 0, name: NVIDIA GeForce RTX 2060, pci bus id: 0000:01:00.0, compute capability: 7.5)
2021-08-27 10:20:28.198186: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:176] None of the MLIR Optimization Passes are enabled (registered 2)
Epoch 1/50
2021-08-27 10:20:29.513381: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library cudnn64_8.dll
Process finished with exit code -1073740791 (0xC0000409)
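For reference, here is a minimal sketch of the kind of model I mean (not my exact code; the input shape, layer sizes, and data are placeholders):

import numpy as np
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, LSTM, InputLayer

model = Sequential([
    InputLayer(input_shape=(30, 8)),   # (timesteps, features) -- made-up shape
    LSTM(64),                          # default activation='tanh', eligible for the cuDNN kernel
    Dropout(0.2),
    Dense(1),
])
model.compile(optimizer="adam", loss="mse")

x = np.random.rand(256, 30, 8).astype("float32")
y = np.random.rand(256, 1).astype("float32")
model.fit(x, y, epochs=50, batch_size=32)   # this is where it dies at "Epoch 1/50" on my machine

I suspect this is related to the cuDNN LSTM kernel: with the default activation (“tanh”), Keras can use the fused cuDNN implementation on the GPU, whereas a non-default activation such as “relu” forces the generic implementation, and the log above shows cudnn64_8.dll being loaded right before the crash.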
Healthcare giant Johnson & Johnson is injecting data science across its business to improve its manufacturing, clinical trial enrollment, forecasting and more. “I actually like to call it decision science,” said Jim Swanson, the company’s executive vice president and enterprise chief information officer, in a panel discussion at the most recent NVIDIA GPU Technology Conference.
Introduction
The August release (21.08) of RAPIDS Accelerator for Apache Spark is now available. It has been a little over a year since the first release at NVIDIA GTC 2020, and we have improved in many ways, particularly in ease of use, with minimal to no code changes required for Apache Spark applications. Over the last year, the team has focused on adding functionality and continuously improving performance. As a testament to that, we periodically measure performance and functionality over time with the NVIDIA Decision Support (NDS) benchmark at a scale factor of 3,000 (3 TB uncompressed). In this release, apart from adding new features, we are extremely proud to have made progress on improving end-to-end speed for all passing queries and lowering the total cost of ownership for NVIDIA EGX servers.
Benchmark updates
NVIDIA Decision Support (NDS) is our adaptation of an industry-standard data science benchmark often used in the Apache Spark community. NDS consists of the same 105 SQL queries as the industry standard benchmark TPC-DS, but has modified parts for dataset generation and execution scripts. In our GTC 2021 update, we had 95 queries passing. With the 21.08 release, with new features such as out-of-core group by, window rank, and dense_rank, we have enabled all of the 105 queries to run on the GPU.
Benchmark setup
Scale Factor — 3K (3TB Dataset with floats)
Systems: 4x NVIDIA Certified EGX Server
EGX Server Hardware Spec: 4-node Dell R740xd, each with (2) 24-core CPUs, 512GB RAM, HDFS on NVMe, (1) CX-6 Dx 25/100Gb NIC, 2x NVIDIA A30 GPU
CPU Hardware Spec: 4-node Dell R740xd, each with (2) 24-core CPUs, 512GB RAM, HDFS on NVMe, (1) CX-6 Dx 25/100Gb NIC
Figure 1: NDS Queries Speed-up on EGX Servers: GPU vs CPU.
Based on this release, we are excited to show that all the 105 queries can now run without any code change on the GPU.
The servers used for these benchmarks cost a little under $170,000 for four servers without GPUs, and $220,000 with one NVIDIA A100 GPU added to each server.
In simple terms, the benchmark GPU servers cost about 1.29 times as much as the CPU servers.
As the chart above (Figure 1) shows, more than 95 of the queries now run more than 1.29x faster on the GPU, which makes them not only faster but also cheaper to run on GPU than on CPU.
We are actively working to improve the queries that are still slower on the GPU, as well as the overall speed-ups.
GPU speed-ups vary from roughly 1x to 18x across queries, so we suggest that users qualify the right use cases for GPUs.
The Qualification Tool would be a handy asset if users are unsure about the right use case for GPU. For more information about the Qualification Tool, refer to the section below.
Profiling & qualification tool
The Profiling & Qualification tool, released in 21.06, saw positive feedback from the user community as well as requests for new features. In 21.08 the qualification tool now has the ability to handle event logs generated by Apache Spark 2.x versions. The tool will also support event logs generated by AWS EMR 6.3.0, Google Dataproc 2.0, Microsoft Azure Synapse, and the Databricks 7.3 and 8.2 runtimes. The qualification tool will no longer require a Spark runtime. Users can now use the qualification tool with just Apache Spark 3.x jars on their machine. The latest version also has new filtering capabilities to choose event logs. The tool also looks for read data formats and types that the plugin doesn’t support and removes these from the score (based on the total task time in SQL Dataframe operations). The output will be reported in a concise format on the terminal and a detailed analysis of each of the processed event logs will be stored as a csv output.
New functionality
This release adds more functionality for arrays and structs. We can now do a union on multi-level struct data types and can also write array data types in Parquet format. We have added rank and dense_rank window functions to the existing lead, lag, and row_number functionality; with these additions, the RAPIDS Accelerator now supports the most commonly used window operators in SQL. For the timestamp operators, we have added support for LEGACY timestamps, so users can read legacy timestamp formats supported in Spark 2.0. For Databricks users, we have added the ability to cache data on the GPU (this was already supported on all other platforms).
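As a quick illustration, the following PySpark sketch (column names and data are made up, and it assumes the RAPIDS Accelerator jars are already on the driver and executor classpath) uses the newly supported rank and dense_rank window functions, which can now run on the GPU:

from pyspark.sql import SparkSession, Window
from pyspark.sql.functions import col, rank, dense_rank

spark = (SparkSession.builder
         .appName("rank-on-gpu-sketch")
         .config("spark.plugins", "com.nvidia.spark.SQLPlugin")   # enables the RAPIDS Accelerator
         .config("spark.rapids.sql.enabled", "true")
         .getOrCreate())

df = spark.createDataFrame(
    [("a", 1), ("a", 3), ("b", 2), ("b", 2)],
    ["key", "value"])

w = Window.partitionBy("key").orderBy(col("value").desc())
df.select("key", "value",
          rank().over(w).alias("rank"),
          dense_rank().over(w).alias("dense_rank")).show()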
We continue to make the user experience better with the ability to handle datasets that spill out of GPU memory for group by and windowing operations. This improvement will save users time creating partitions to avoid out-of-memory errors on the GPU. Similarly, the adoption of UCX 1.11 has improved error handling for RAPIDS Spark Accelerated Shuffle Manager.
As we noted previously, we moved to CalVer and a bi-monthly release cadence with the 21.06 release. Upcoming versions will add expanded support for additional decimal types and continue to add more nested data type support for multi-level structs and maps. In addition, look out for micro-benchmarks with code samples and notebooks that highlight the operations best suited for GPUs. We want to hear from you, the users: reach out to us on GitHub and let us know how we can continue to improve your experience using RAPIDS Spark.
GFN Thursday is here to wake you up when September begins, because there are a bunch of awesome day-and-date launch games coming to GeForce NOW this month. September brings 16 new day-and-date games to the cloud — including the anticipated Life is Strange: True Colors. They’re part of the 34 games being added throughout the month.
Has anyone using PyCharm had any luck installing the TensorFlow library? What guides did you follow? How should I approach installing and using the library? The traditional package installation via PyCharm doesn’t seem to work for me, unfortunately. I’m using Python 3.8 and trying to install version 2.6.0.
Is this possible? I wrote a python script that uses tensorflow object detection and I got everything working correctly but now I want to turn this into a .exe so that I can bring it on a flash drive easily and show it to friends.
PyTorch Lightning is a lightweight PyTorch wrapper for high-performance AI research. Organizing PyTorch code with Lightning enables seamless training on multiple GPUs and the use of best practices such as checkpointing, logging, sharding, and mixed precision. In this post, we walk you through building speech models with PyTorch Lightning on NVIDIA GPU-powered AWS instances managed by the Grid.ai platform.
AI is driving the fourth Industrial Revolution with machines that can hear, see, understand, analyze, and then make smart decisions at superhuman levels. However, the effectiveness of AI depends on the quality of the underlying models. So, whether you’re an academic researcher or a data scientist, you want to quickly build models with a variety of parameters and identify the most effective ones for your solutions.
In this post, we walk you through building speech models with PyTorch Lightning on NVIDIA GPU-powered AWS instances.
PyTorch Lightning + Grid.ai: Build models faster, at scale
PyTorch Lightning is a lightweight PyTorch wrapper for high-performance AI research. Organizing PyTorch code with Lightning enables seamless training on multiple GPUs, TPUs, and CPUs, and the use of difficult-to-implement best practices such as checkpointing, logging, sharding, and mixed precision. A PyTorch Lightning container and developer environment is available on the NGC catalog.
Grid enables you to scale training from your laptop to the cloud without having to modify your code. Running on cloud providers such as AWS, Grid supports Lightning as well as classic machine learning frameworks such as scikit-learn, TensorFlow, Keras, PyTorch, and more. With Grid, you can scale the training of models from the NGC catalog.
NGC: The hub for GPU-optimized AI software
The NGC catalog is the hub for GPU-optimized software, including AI/ML containers, pretrained models, and SDKs that can be easily deployed across on-premises, cloud, edge, and hybrid environments. NGC offers the NVIDIA TAO Toolkit, which enables retraining models with custom data, and NVIDIA Triton Inference Server, which runs predictions on CPU- and GPU-powered systems.
The rest of this post walks you through how to leverage models from the NGC catalog and the NVIDIA NeMo framework to train an automatic speech recognition (ASR) model with PyTorch Lightning using the following tutorial based on the ASR with NeMo tutorial.
Figure 1. AI model training process
Training NGC models with Grid sessions, PyTorch Lightning, and NVIDIA NeMo
ASR is the task of transcribing spoken language to text and is a critical component of Speech to Text systems. When training ASR models, your goal is to generate text from a given audio input that minimizes the word error rate (WER) metric on human transcribed speech. The NGC catalog contains state-of-the-art pretrained models for ASR.
In the remainder of this post, we show you how to use Grid sessions, NVIDIA NeMo, and PyTorch Lightning to fine-tune these models on the AN4 dataset.
The AN4 dataset, also known as the Alphanumeric dataset, was collected and published by Carnegie Mellon University. It consists of recordings of people spelling out addresses, names, telephone numbers, and so on, one letter or number at a time, as well as their corresponding transcripts.
Step 1: Create a Grid session optimized for Lightning and pretrained NGC models
Grid sessions run on the same hardware that you need to scale while providing you with preconfigured environments to iterate the research phase of the machine learning process faster than before. Sessions are linked to GitHub, loaded with JupyterHub, and can be accessed through SSH and your IDE of choice without having to do any setup yourself.
With sessions, you pay only for the compute that you need to get a baseline operational, and then you can scale your work to the cloud with Grid runs. Grid sessions are optimized for PyTorch Lightning and models hosted on the NGC catalog. They even provide specialized Spot pricing.
For an in-depth walkthrough, see the Grid Session tour (requires a Grid.ai account).
Figure 2. Workflow to create a Grid session
Step 2: Clone the ASR demo repo and open the tutorial notebook
Now that you have a developer environment optimized for PyTorch Lightning, the next step is to clone the NGC-Lightning-Grid-Workshop repo.
You can do this directly from a terminal in your Grid Session with the following command:
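The exact repository URL is not shown here; a typical invocation (substitute the real location of the NGC-Lightning-Grid-Workshop repo) would look like this:

git clone https://github.com/<org>/NGC-Lightning-Grid-Workshop.git
cd NGC-Lightning-Grid-Workshop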
After you’ve cloned the repo, you can open up the notebook to use to fine-tune the NGC hosted model with NeMo and PyTorch Lightning.
Step 3: Install NeMo ASR dependencies
First, install all the session dependencies. These include tools such as PyTorch Lightning and NeMo, as well as the libraries needed to process the AN4 dataset. Run the first cell in the tutorial notebook, which executes the following bash commands to install them.
## Install dependencies
!pip install wget
!sudo apt-get install sox libsndfile1 ffmpeg -y
!pip install unidecode
!pip install "matplotlib>=3.3.2"

## Install NeMo
BRANCH = 'main'
!python -m pip install --user git+https://github.com/NVIDIA/NeMo.git@$BRANCH#egg=nemo_toolkit[all]

## Grab the config we'll use in this example
!mkdir configs
!wget -P configs/ https://raw.githubusercontent.com/NVIDIA/NeMo/$BRANCH/examples/asr/conf/config.yaml
Step 4: Convert and visualize the AN4 dataset
The AN4 dataset comes as raw Sph audio files, but most models process mel spectrograms. Convert the Sph files to the WAV format so that you can use the NeMo audio processing.
import librosa
import IPython.display as ipd
import glob
import os
import subprocess
import tarfile
import wget
# Download the dataset. This will take a few moments...
data_dir = '.'  # working directory for the dataset (assumed; set this to wherever you want the data)
print("******")
if not os.path.exists(data_dir + '/an4_sphere.tar.gz'):
    an4_url = 'http://www.speech.cs.cmu.edu/databases/an4/an4_sphere.tar.gz'
    an4_path = wget.download(an4_url, data_dir)
    print(f"Dataset downloaded at: {an4_path}")
else:
    print("Tarfile already exists.")
    an4_path = data_dir + '/an4_sphere.tar.gz'
if not os.path.exists(data_dir + '/an4/'):
    # Untar and convert .sph to .wav (using sox)
    tar = tarfile.open(an4_path)
    tar.extractall(path=data_dir)

    print("Converting .sph to .wav...")
    sph_list = glob.glob(data_dir + '/an4/**/*.sph', recursive=True)
    for sph_path in sph_list:
        wav_path = sph_path[:-4] + '.wav'
        cmd = ["sox", sph_path, wav_path]
        subprocess.run(cmd)
    print("Finished conversion.\n******")

# Load and listen to the audio file
example_file = data_dir + '/an4/wav/an4_clstk/mgah/cen2-mgah-b.wav'
audio, sample_rate = librosa.load(example_file)
ipd.Audio(example_file, rate=sample_rate)
You can then visualize the audio example as images of the audio waveform. Figure 3 shows the activity in the waveform that corresponds to each letter in the audio, as your speaker here enunciates quite clearly!
Figure 3. Audio waveform of the sample example
Each spoken letter has a different “shape.” It’s interesting to note that the last two blobs look relatively similar, which is expected because they are both the letter N.
Spectrograms
Modeling audio is easier in the context of frequencies of sound over time. You can get a better representation than this raw sequence of 57,330 values. A spectrogram is a good way of visualizing how the strengths of various frequencies in the audio vary over time. It is obtained by breaking up the signal into smaller, usually overlapping chunks, and performing a short-time Fourier transform (STFT) on each.
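As an illustrative sketch (this is not code from the original tutorial), such a spectrogram can be computed with librosa from the audio and sample_rate values loaded earlier:

import numpy as np
import librosa

stft = librosa.stft(audio)                                           # short-time Fourier transform
spectrogram_db = librosa.amplitude_to_db(np.abs(stft), ref=np.max)   # magnitude in dB
print(spectrogram_db.shape)                                          # (frequency bins, time frames)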
Figure 4 shows what the spectrogram of the sample looks like.
Figure 4. Audio spectrogram of the sample example
As in the earlier waveform, you see each letter being pronounced. How do you interpret these shapes and colors? Just as in the earlier waveform plot, you see time passing on the x-axis (all 2.6s of audio). However, now the y-axis represents different frequencies (on a log scale), and the color on the plot shows the strength of a frequency at a particular point in time.
Mel spectrograms
You’re still not done, as you can make one more potentially useful tweak by visualizing the data using the mel spectrogram. Change the frequency scale from linear (or logarithmic) to the mel scale, which better represents the pitches that are perceivable to the human ear. Mel spectrograms are intuitively useful for ASR. Because you are processing and transcribing human speech, mel spectrograms reduce background noise that can affect the model.
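Again as a sketch rather than the tutorial’s own code, the mel spectrogram can be computed and plotted with librosa using the same audio and sample_rate as before:

import numpy as np
import librosa
import librosa.display
import matplotlib.pyplot as plt

mel = librosa.feature.melspectrogram(y=audio, sr=sample_rate)
mel_db = librosa.power_to_db(mel, ref=np.max)   # convert power to dB for plotting

librosa.display.specshow(mel_db, sr=sample_rate, x_axis='time', y_axis='mel')
plt.colorbar(format='%+2.0f dB')
plt.title('Mel spectrogram')
plt.show()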
Figure 5. Mel spectrogram of the sample example
Step 5: Load and inference a pretrained QuartzNet model from NGC
Now that you’ve loaded and properly understood the AN4 dataset, look at how to use NGC to load an ASR model to be fine-tuned with PyTorch Lightning. NeMo’s ASR collection comes with many building blocks and even complete models that you can use for training and evaluation. Moreover, several models come with pretrained weights.
To model the data for this post, you use a Jasper architecture called QuartzNet from the NGC Model Hub. The Jasper architecture consists of repeated block structures that use 1D convolutions to model spectrogram data (Figure 6).
Figure 6. Jasper/QuartzNet model
QuartzNet is a variant of Jasper; its key difference is that it uses time-channel separable 1D convolutions, which dramatically reduces the number of weights while keeping similar accuracy.
The following command downloads the pretrained QuartzNet15x5 model from the NGC catalog and instantiates it for you.
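In NeMo’s ASR collection, that step typically looks like the following (the model name string is the identifier used for QuartzNet15x5 on NGC):

import nemo.collections.asr as nemo_asr

quartznet = nemo_asr.models.EncDecCTCModel.from_pretrained(model_name="QuartzNet15x5Base-En")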
Because the model is trained with the PyTorch Lightning Trainer, you get some key advantages by default, such as model checkpointing and logging. You can also use 50+ best-practice tactics without needing to modify the model code, including multi-GPU training, model sharding, DeepSpeed, quantization-aware training, early stopping, mixed precision, gradient clipping, and profiling.
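A rough fine-tuning sketch follows (the manifest paths and hyperparameters are placeholders; the config file is the one downloaded in Step 3). NeMo models are LightningModules, so they plug directly into the Trainer:

import pytorch_lightning as pl
from omegaconf import OmegaConf

params = OmegaConf.load("configs/config.yaml")                 # config downloaded in Step 3
train_manifest = data_dir + '/an4/train_manifest.json'         # assumed AN4 manifest paths
test_manifest = data_dir + '/an4/test_manifest.json'
params.model.train_ds.manifest_filepath = train_manifest
params.model.validation_ds.manifest_filepath = test_manifest

quartznet.setup_training_data(train_data_config=params.model.train_ds)
quartznet.setup_validation_data(val_data_config=params.model.validation_ds)

trainer = pl.Trainer(gpus=1, max_epochs=50)                    # placeholder settings
trainer.fit(quartznet)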
Figure 7. Fine-tuning tactics
Step 7: Inference and deployment
Now that you have a baseline model, run inference with it.
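One way to do that (the audio path here is just the sample file from earlier) is NeMo’s transcribe helper, which runs the model over a list of audio files and returns the predicted text:

files = [data_dir + '/an4/wav/an4_clstk/mgah/cen2-mgah-b.wav']
print(quartznet.transcribe(paths2audio_files=files))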
Figure 8. Run inference
Step 8: Pause session
Now that you have trained the model, you can pause the session and all the files that you need are persisted.
Figure 9. Monitor Grid session
Paused sessions are free of charge and can be resumed as needed.
Conclusion
Now, you should have a better understanding of PyTorch Lightning, NGC, and Grid. You’ve fine-tuned your first NGC NeMo model and optimized it with Grid runs. We are excited to see what you do next with Grid and NGC.
Ray Tracing Gems II is now available as a hardcover on Apress and Amazon.
Ray Tracing Gems II is now available to download for free via Apress and Amazon and as a hardcover on Apress and Amazon as well. For those who love books as a physical medium, we recommend purchasing a copy for your home library, while also downloading the free PDF version for easy digital access on the go.
This Open Access book is a must-have for anyone interested in real-time rendering. Ray tracing is the holy grail of gaming graphics, simulating the physical behavior of light to bring real-time, cinematic-quality rendering to even the most visually intense games. Ray tracing is also a fundamental algorithm used for architecture applications, visualization, sound simulation, deep learning, and more.
We’ve collaborated with our partners to make four limited edition versions of the book, featuring custom covers that highlight real-time ray tracing in Fortnite, Control, Watch Dogs: Legion, and Quake II RTX.
Posted by Zaid Nabulsi, Software Engineer and Po-Hsuan Cameron Chen, Software Engineer, Google Health
The adoption of machine learning (ML) for medical imaging applications presents an exciting opportunity to improve the availability, latency, accuracy, and consistency of chest X-ray (CXR) image interpretation. Indeed, a plethora of algorithms have already been developed to detect specific conditions, such as lung cancer, tuberculosis and pneumothorax. By virtue of being trained to detect a specific disease, however, the utility of these algorithms may be limited in a general clinical setting, where a wide variety of abnormalities could surface. For example, a pneumothorax detector is not expected to highlight nodules suggestive of cancer, and a tuberculosis detector may not identify findings specific to pneumonia. Since an initial triaging step is to determine whether a CXR contains any concerning abnormalities, a general-purpose algorithm that identifies X-rays containing any sort of abnormality could significantly facilitate the workflow. However, developing a classifier to detect any abnormality is challenging due to the wide variety of abnormal findings that present on CXRs.
A Deep Learning System for Detecting Abnormal Chest X-rays The deep learning system we used is based on the EfficientNet-B7 architecture, pre-trained on ImageNet. We trained the model using over 200,000 de-identified CXRs from the Apollo Hospitals in India. Each CXR was assigned a label of either “normal” or “abnormal” using a regular expression–based natural language processing approach on the associated radiology reports.
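As a rough sketch of that kind of setup (this is not Google’s code; the input resolution and training details are assumptions), a binary normal/abnormal classifier built on an ImageNet-pretrained EfficientNet-B7 backbone could be assembled in Keras like this:

import tensorflow as tf

backbone = tf.keras.applications.EfficientNetB7(
    include_top=False, weights="imagenet", pooling="avg",
    input_shape=(600, 600, 3))                       # input resolution is an assumption

model = tf.keras.Sequential([
    backbone,
    tf.keras.layers.Dense(1, activation="sigmoid"),  # probability that the CXR is abnormal
])
model.compile(optimizer="adam",
              loss="binary_crossentropy",
              metrics=[tf.keras.metrics.AUC(curve="ROC")])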
To evaluate how well the system generalizes to new patient populations, we compared its performance on two datasets consisting of a wide spectrum of abnormalities: the test split from the Apollo Hospitals dataset (DS-1), and the publicly available ChestX-ray14 (CXR-14). The labels for these two test sets were annotated for the purposes of this project by a group of US board-certified radiologists. The system achieved areas under the receiver operating characteristic curve (AUROC) of 0.87 on DS-1 and 0.94 on CXR-14 (higher is better).
Though the evaluations on DS-1 and CXR-14 contained a wide range of abnormalities, a possible use-case would be to utilize such an abnormality detector in novel or unforeseen settings with diseases that it had not encountered before. To evaluate the generalizability of the system to new patient populations and in the presence of diseases not seen in the training set, we used four de-identified datasets from three countries, including two publicly available tuberculosis datasets and two COVID-19 datasets from Northwestern Medicine. The system achieved AUCs of 0.95-0.97 in detecting tuberculosis, and 0.65-0.68 in detecting COVID-19. Because CXRs that are negative for these diseases could still contain other concerning abnormalities, we further evaluated the system for its ability to detect abnormalities more broadly (instead of disease positive vs. negative), finding AUCs of 0.91-0.93 for the tuberculosis dataset, and AUCs of 0.86 for the COVID-19 dataset.
The purpose of multiple evaluations (abnormality detection and disease detection) is the distinction between the two: a given disease can present with a certain abnormality or not; and a certain abnormality can arise from multiple diseases. Our study evaluates for both.
AUCs for the three evaluation setups:

                              1. General        2. Unseen disease:   3. Unseen disease:
                                 abnormalities     Tuberculosis         COVID-19
Detect abnormalities          0.87-0.94          0.91-0.93            0.86
Detect respective disease     –                  0.95-0.97            0.65-0.68
The large drop in performance for COVID-19 is because many cases flagged by the system as “positive” for abnormalities were negative for COVID-19, but nevertheless contained abnormal CXR findings that needed attention. This further highlights the usefulness of abnormality detectors even if disease-specific models are available.
In addition, it’s important to note that there is a difference between generalization to unseen diseases (i.e., tuberculosis and COVID-19) versus generalization to unseen CXR findings (e.g., pleural effusion, consolidation/infiltrate). In this study, we demonstrated the generalizability of the system to unseen diseases but not necessarily unseen CXR findings.
Sample chest X-rays of true and false positives, and true and false negatives for (A) general abnormalities, (B) tuberculosis, and (C) COVID-19. On each CXR, we outline in red the areas on which the model focused to identify abnormalities (i.e., the class activation map), and outline the regions of interest indicated by a radiologist in yellow.
Potential Benefits in the Clinic To understand the potential utility of the deep learning model in improving clinical workflow, we simulated its use for case prioritization, where abnormal cases are “expedited” ahead of normal cases. In these simulations, the system reduced the turnaround time for abnormal cases by up to 28%. This reprioritization setup could be used to divert complex abnormal cases to cardiothoracic specialist radiologists, enable rapid triage of cases that may need urgent decisions, and provide the opportunity to batch negative CXRs for streamlined review.
Impact of a simulated deep learning model–based prioritization in comparison with random review order for (A) general abnormalities, (B) tuberculosis, and (C) COVID-19. The red bars indicate sequences of abnormal CXRs in red and normal CXRs in pink; a greater density of red towards the left indicates abnormal CXRs are reviewed sooner than normal ones. The histograms indicate the average improvement in turnaround time.
Additionally, we found that the system can be used as a pre-trained model to improve other ML algorithms for chest X-rays, especially when data is limited. For example, we used the normal/abnormal classifier in our recent study to detect pulmonary tuberculosis from chest X-rays. Abnormality and tuberculosis detectors can play a critical role in supporting early diagnosis in regions that lack access to resources like trained radiologists or molecular testing.
Sharing Improved Reference Standard Labels Much work remains to be done to realize the potential of ML to aid chest X-ray interpretation around the world. In particular, obtaining high-quality labels on de-identified data can be a significant barrier to developing and evaluating ML algorithms in healthcare. To accelerate these efforts, we are expanding upon our previous label release by releasing the labels used in this study for the publicly available ChestX-ray14 dataset. We look forward to future machine learning projects by the community in this space.
Acknowledgements
Key contributors to this project at Google include Zaid Nabulsi, Andrew Sellergren, Shahar Jamshy, Charles Lau, Eddie Santos, Atilla P. Kiraly, Wenxing Ye, Jie Yang, Rory Pilgrim, Sahar Kazemzadeh, Jin Yu, Greg S. Corrado, Lily Peng, Krish Eswaran, Daniel Tse, Neeral Beladia, Yun Liu, Po-Hsuan Cameron Chen, Shravya Shetty. Significant contributions and input were also made by radiologist collaborators Sreenivasa Raju Kalidindi, Mozziyar Etemadi, Florencia Garcia Vicente, David Melnick. For the CXR-14 dataset, we thank the NIH Clinical Center for making it publicly available. For tuberculosis data collection, thanks go to Sameer Antani, Stefan Jaeger, Sema Candemir, Zhiyun Xue, Alex Karargyris, George R. Thomas, Pu-Xuan Lu, Yi-Xiang Wang, Michael Bonifant, Ellan Kim, Sonia Qasba, and Jonathan Musco. The authors would also like to acknowledge many members of the Google Health Radiology and labeling software teams, in particular Shruthi Prabhakara, Scott McKinney, and Akib Uddin. Sincere appreciation also goes to the radiologists who enabled this work with their image interpretation and annotation efforts throughout the study; Jonny Wong for coordinating the imaging annotation work; Gavin Bee, Mikhail Fomitchev, Shabir Adeel, Jeff Bertram, and Benedict Noero for data releasing; David F. Steiner, Kunal Nagpal, and Michael D. Howell for providing feedback on the manuscript; Craig Mermel, Lauren Winer, Johnny Luu, Adrienne Welch, Annisah Um’rani, and Ashley Zlatinov for feedback on the blogpost.
1. Labels include atelectasis, cardiomegaly, effusion, infiltration, mass, nodule, pneumonia, pneumothorax, consolidation, edema, emphysema, fibrosis, pleural thickening, hernia, other abnormality, and normal vs abnormal.