My neural network classifies this video as excellent!
Dear Jeff,
Thank you so much for explaining with such clarity.
I have a problem and I'm stuck right now. I have a multi-class classification problem for which I get a very high AUC score (>0.98) while the other metric scores (precision, recall, accuracy) are relatively low (~0.69). I have an explanation, but I need someone to validate whether my reasoning is correct.
So in the case of multi-class problems, if I want to use ROC I need to binarize the class labels (since ROC works for binary problems). Now, for instance, if I am using the one-vs-one configuration (Hand and Till, 2001) and the classifier distributes the misclassifications (false positives and false negatives) uniformly across many (or even all) classes, I will inevitably have a small number of false positives and a small number of false negatives for any pair of classes. Consequently, fewer false positives and false negatives amount to a high true positive rate and a low false positive rate, and therefore a high AUC. Since this does not really mean the classifications are mostly correct, you still get a low score on accuracy, precision, or recall. Does that make sense?
My dataset is balanced.
I hope I have succeeded to some extent at explaining my problem and my understanding of it.
I would really appreciate it if you could help me resolve this problem. Thank you so much!
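(To make the contrast concrete, here is a minimal sketch, assuming scikit-learn >= 0.22, that computes the Hand-and-Till-style macro one-vs-one AUC next to plain accuracy; the synthetic balanced dataset and the logistic model are placeholders, not my actual setup:

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score, accuracy_score

# Balanced 5-class problem, purely illustrative.
X, y = make_classification(n_samples=2000, n_classes=5, n_informative=8,
                           n_clusters_per_class=1, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)
clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
proba = clf.predict_proba(X_test)  # shape (n_samples, n_classes)

# OvO AUC averages pairwise class-vs-class rankings, so misclassifications
# spread thinly over many class pairs barely dent it...
print("OvO macro AUC:", roc_auc_score(y_test, proba, multi_class='ovo'))
# ...while accuracy charges full price for every wrong argmax.
print("Accuracy:", accuracy_score(y_test, clf.predict(X_test)))

The two numbers the script prints can diverge in exactly the direction I described.)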
Thanks for the amazing content. I was wondering whether I should apply another strategy to prevent overfitting or to improve the accuracy or ROC performance, something like crossover, or whether using early stopping is enough?
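For the early-stopping half of the question, here is a minimal Keras sketch; the toy data, tiny model, and patience value are illustrative assumptions, not the video's exact setup:

import numpy as np
import tensorflow as tf

# Toy binary problem so the snippet runs standalone.
x = np.random.rand(500, 10)
y = (x.sum(axis=1) > 5).astype(int)

model = tf.keras.Sequential([
    tf.keras.layers.Dense(16, activation='relu', input_shape=(10,)),
    tf.keras.layers.Dense(1, activation='sigmoid')])
model.compile(loss='binary_crossentropy', optimizer='adam',
              metrics=['accuracy'])

# Stop once validation loss has not improved for 5 epochs, and roll back
# to the best weights seen so far.
early_stop = tf.keras.callbacks.EarlyStopping(
    monitor='val_loss', patience=5, restore_best_weights=True)
model.fit(x, y, validation_split=0.2, epochs=100, verbose=0,
          callbacks=[early_stop])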
I love your videos! They are super helpful!
On the ROC chart example, in the code, you have removed all the save lines (checkpointer = ...) that are in the video. I'm having a hard time saving anything to my Windows drive. Is it even possible to save those best 'weights' from a Jupyter notebook to our drive?
I have tried this from previous videos:
checkpoint_path = "C:/Users/lalan/Tensor/data/bestmodels/best_weight.h5"
checkpointer = model.save(checkpoint_path)
model.load_weights(checkpoint_path) # Load weights from best model.
It did save, and it gave no error on load. But later, when I tried plot_roc, I got:
ValueError: continuous format is not supported
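In case it helps anyone hitting the same error: that particular ValueError usually means scikit-learn saw continuous values where it expected binary labels. roc_curve wants the true 0/1 labels as its first argument and the continuous scores as its second. A minimal sketch of the expected call shape (the toy arrays are illustrative):

import numpy as np
from sklearn.metrics import roc_curve, auc

y_true = np.array([0, 0, 1, 1, 0, 1])               # binary ground truth
scores = np.array([0.1, 0.4, 0.35, 0.8, 0.2, 0.9])  # continuous predictions

# Swapping these arguments, or passing predictions as y_true, is what
# triggers "continuous format is not supported".
fpr, tpr, thresholds = roc_curve(y_true, scores)
print("AUC:", auc(fpr, tpr))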
Great video! I have a question: since we use ROC to measure binary classification, why not use metrics=['roc_auc'] in the compile step instead of ['accuracy']? Which metric is better to put there?
model.compile(loss='binary_crossentropy',
              optimizer=tensorflow.keras.optimizers.Adam(),
              metrics=['accuracy'])
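As a side note, 'roc_auc' is a scikit-learn scorer name rather than a Keras metric string; the Keras built-in equivalent is tf.keras.metrics.AUC, and nothing stops you from tracking both it and accuracy. A minimal sketch (the tiny model is just a placeholder):

import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Dense(8, activation='relu', input_shape=(4,)),
    tf.keras.layers.Dense(1, activation='sigmoid')])

# Report both accuracy and ROC AUC during training and evaluation.
model.compile(loss='binary_crossentropy',
              optimizer=tf.keras.optimizers.Adam(),
              metrics=['accuracy', tf.keras.metrics.AUC(name='auc')])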
Why don't you calculate the ROC curve for multiclass classification?
Thanks. I wasn't aware that a binary_classification could be used with a single y -- testing that now. :-)
For the first network, when you classify the type of cancer, why do you use a linear activation function on the output layer? When you have binary classification, I thought you should use a sigmoid activation function. Can you please clarify?
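For reference, the conventional binary setup pairs a single sigmoid output with binary cross-entropy, but a linear output is also legitimate if the loss is told to expect raw logits. A minimal sketch of both (layer sizes are illustrative assumptions):

import tensorflow as tf

# Conventional: sigmoid squashes the output into (0, 1), read as P(class=1).
model = tf.keras.Sequential([
    tf.keras.layers.Dense(25, activation='relu', input_shape=(30,)),
    tf.keras.layers.Dense(1, activation='sigmoid')])
model.compile(loss='binary_crossentropy', optimizer='adam')

# Equivalent alternative: linear (logit) output, with the sigmoid applied
# inside the loss via from_logits=True.
logit_model = tf.keras.Sequential([
    tf.keras.layers.Dense(25, activation='relu', input_shape=(30,)),
    tf.keras.layers.Dense(1)])  # linear activation by default
logit_model.compile(loss=tf.keras.losses.BinaryCrossentropy(from_logits=True),
                    optimizer='adam')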
Great prof! Thank you!
In the example where y holds the diagnosis values, it needs to be converted to a NumPy array by adding .values:
y = df['diagnosis'].map({'M':1,"B":0}).values # Binary classification, M is 1 and B is 0
Otherwise training fails.
What does kernel_initializer='random_normal' do?
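It sets how the layer's weight matrix (the "kernel") is filled before training: 'random_normal' is shorthand for Keras's RandomNormal initializer, which draws the initial weights from a Gaussian with mean 0.0 and stddev 0.05 by default. A minimal sketch showing the shorthand and its explicit form:

import tensorflow as tf

# These two layers start from the same distribution (up to the random seed).
shorthand = tf.keras.layers.Dense(10, kernel_initializer='random_normal')
explicit = tf.keras.layers.Dense(
    10,
    kernel_initializer=tf.keras.initializers.RandomNormal(mean=0.0,
                                                          stddev=0.05))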
At 1:19 you have your ending advertisement covering other things you are saying.
It's really frustrating when you do not scroll sideways in the code. Yes, you have said that you are not interested in explaining the code, but for someone who is a beginner and has no idea how the code is written, this video is a waste of time because it does not even show the arguments completely.
Maybe click on the github link instead of trying to copy code from a video?