Optimizing TF.keras ML binary classifier for specific health task

Hello everyone, I am looking for advice on a specific task: optimizing the result of a binary classifier.

I have real-world DBs of health data from two centers regarding ER patients with a specific pathology, and I am trying to see if I can distinguish between high-grade/emergency cases and low-grade/less-urgent cases. It’s not an easy task, even for medical professionals. I have a number of parameters, both boolean (medical history, symptoms) and continuous (labs), that have been shown to be somewhat statistically relevant in distinguishing the two, as well as labels for each case. Positivity is about 5% of the total, which is about 1000+200 cases. The data is incomplete, though, as not all clues are available for every patient. I am planning to use the larger DB for train/test/initial validation and the second, smaller DB as my real-world validation.
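Since positives are only ~5%, I learned that a plain random split can leave very few positives in the test set. A minimal sketch using scikit-learn's stratified split (the X/Y arrays here are toy stand-ins for my real data):

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy stand-ins for X and Y (the real data comes from the DB);
# roughly 5% positive labels, as in the dataset described above.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 8))
Y = (rng.random(1000) < 0.05).astype(int)

# stratify=Y keeps the ~5% positive rate (nearly) identical in train and
# test, so a rare-positive split can't end up with almost no positives
# in the test set by chance.
x_train, x_test, y_train, y_test = train_test_split(
    X, Y, test_size=0.25, stratify=Y, random_state=42)

print(y_train.mean(), y_test.mean())  # positive rates should be close
```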

I set up a Sequential model using some online tutorials, though it is unclear to me what guides the choice of layer numbers/type/activation/etc. My current setup, after some fiddling, is the following (abridged, obviously):
EPOCHS = 100
TEST_SIZE = 0.25
x_train, x_test, y_train, y_test = train_test_split( X, Y, test_size=TEST_SIZE)
# define the model
model = tf.keras.models.Sequential()
model.add(tf.keras.layers.Dense(10, input_dim=x_train.shape[1], activation='relu', kernel_initializer='he_normal'))
model.add(tf.keras.layers.Dense(1, activation='sigmoid'))

# compile the keras model (the start of this line got cut off when pasting;
# loss/optimizer restored here, with 'adam' assumed)
model.compile(loss='binary_crossentropy', optimizer='adam',
              metrics=[tf.keras.metrics.RecallAtPrecision(0.95), tf.keras.metrics.Accuracy(), tf.keras.metrics.Recall(), tf.keras.metrics.Precision()])

# Fit model on training data
history = model.fit(x_train, y_train, validation_split=0.33, batch_size=16, epochs=EPOCHS)
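One thing I am experimenting with for the ~5% positivity: class weights, so the loss is not dominated by the negatives. A sketch of "balanced" weights computed by hand (the toy labels stand in for my y_train; the fit call is the same as above with one extra argument):

```python
import numpy as np

# Toy label vector with ~5% positives, standing in for y_train.
rng = np.random.default_rng(1)
y_train = (rng.random(800) < 0.05).astype(int)

# "Balanced" weighting: each class contributes equally to the total loss,
# so the rare positives are upweighted by roughly 1 / 0.05.
n = len(y_train)
n_pos = int(y_train.sum())
n_neg = n - n_pos
class_weight = {0: n / (2 * n_neg), 1: n / (2 * n_pos)}
print(class_weight)

# Then pass it to Keras when fitting (same call as above, plus class_weight):
# history = model.fit(x_train, y_train, validation_split=0.33,
#                     batch_size=16, epochs=EPOCHS, class_weight=class_weight)
```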

Now, this is not a time-sensitive operation, so I would like to get the best result possible. I would also like not to miss positive cases if I can help it, within reason. I suppose I could add a “maybe” category if it increased accuracy significantly on the other two categories.
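As a sketch of the “maybe” idea: two probability thresholds on the sigmoid output, with everything in between flagged for human review. The LO/HI values here are hypothetical and would need tuning on validation data for a clinically acceptable sensitivity/workload trade-off:

```python
import numpy as np

# Hypothetical thresholds; tune on validation data.
LO, HI = 0.15, 0.85

def triage(probs, lo=LO, hi=HI):
    """Map predicted probabilities (e.g. model.predict(x_test).ravel())
    to 'low', 'maybe', or 'high'."""
    probs = np.asarray(probs)
    return np.where(probs >= hi, "high",
           np.where(probs <= lo, "low", "maybe"))

print(triage([0.03, 0.5, 0.92]))  # ['low' 'maybe' 'high']
```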
This is where I am right now (it varies from run to run, sometimes significantly):

Original Set
RecallAtPrecision: 0.00
Accuracy: 0.24
recall (sensitivity): 60.00
precision (positive predictive value): 50.00

RecallAtPrecision: 8.00
Accuracy: 0.00
recall (sensitivity): 52.00
precision (positive predictive value): 76.47

Now, to the issues:

  1. Why is accuracy so low, while recall and precision are not that bad?
  2. What can I do to make things better? More or fewer layers? More or fewer nodes? More epochs? A different setup altogether? Would it make sense to automate the search for the best parameters somehow?
  3. Any general advice on the matter? Any accessible resources I could look up? (Please note that I am a health professional, not a software developer. I only learned Python and a few rudiments of ML in the last few months to see if I can automate stuff like this.)
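Regarding question 2, here is how I imagine automating the search, sketched with scikit-learn's GridSearchCV over a small MLP as a stand-in for my Keras model (toy data; for Keras itself, KerasTuner offers something similar):

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.neural_network import MLPClassifier

# Toy imbalanced data standing in for the real DB.
rng = np.random.default_rng(2)
X = rng.normal(size=(400, 8))
Y = (rng.random(400) < 0.1).astype(int)

# Grid over hidden-layer sizes and regularization strength.
# scoring='recall' prefers models that miss fewer positives,
# matching the clinical goal of not missing emergency cases.
search = GridSearchCV(
    MLPClassifier(max_iter=500, random_state=0),
    param_grid={"hidden_layer_sizes": [(5,), (10,), (10, 5)],
                "alpha": [1e-4, 1e-2]},
    scoring="recall",
    cv=3,
)
search.fit(X, Y)
print(search.best_params_)
```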

Thank you in advance for your kind responses. I am counting on you to illuminate my way forward.

submitted by /u/N0Br41nZ
