Hi, I was wondering if you have a video on the intuition behind the predict_proba() function for KNN? Also, what's the math behind it, or how is it computed, since KNN is non-parametric? Thanks.
Good question. I think I mentioned it in one of the upcoming videos (I recall talking about that, but don't remember exactly where ;)). Anyways, the predict_proba method for KNNs is relatively simple. For each class, it's just the proportion of the k nearest neighbors that belong to that class (assuming all neighbors are weighted equally). E.g., if you have a binary classification problem with k=5, and there are 3 examples from class 1 and 2 examples from class 2 among the neighbors, then the predict_proba method would return 3/5 = 0.6 for class 1 and 2/5 = 0.4 for class 2.
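To make the k=5 example concrete, here is a minimal plain-Python sketch of that neighbor-counting idea. It is not scikit-learn's actual implementation: the function name `knn_predict_proba`, the toy 1D data, and the choice of Euclidean distance with equal weighting are all assumptions made for illustration.

```python
import math
from collections import Counter


def knn_predict_proba(train_points, train_labels, query, k, classes):
    """Hypothetical helper: fraction of the k nearest neighbors in each class."""
    # Rank training examples by Euclidean distance to the query point.
    order = sorted(
        range(len(train_points)),
        key=lambda i: math.dist(train_points[i], query),
    )
    # Collect the labels of the k closest training examples.
    neighbor_labels = [train_labels[i] for i in order[:k]]
    counts = Counter(neighbor_labels)
    # One proportion per class, in the order given by `classes`.
    return [counts[c] / k for c in classes]


# Toy 1D data (made up): the 5 nearest neighbors of the query 2.0
# are the first five points, with labels 1, 1, 2, 1, 2.
train_points = [(0.0,), (1.0,), (2.0,), (3.0,), (4.0,), (10.0,), (11.0,)]
train_labels = [1, 1, 2, 1, 2, 2, 2]

proba = knn_predict_proba(train_points, train_labels, query=(2.0,), k=5, classes=[1, 2])
print(proba)  # [0.6, 0.4]
```

Three of the five neighbors are class 1 and two are class 2, so the output is one proportion per class, and the proportions sum to 1.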
@SebastianRaschka Thanks for your quick response. So if I understand correctly, it's not really a probability like the one we would get from logistic regression?
@maxpandora995 Yes and no. Technically, if the dataset and k are sufficiently large, then you can think of it as a probability density. I have a note about that on page 19 here: sebastianraschka.com/pdf/lecture-notes/stat451fs20/02-knn__notes.pdf
@SebastianRaschka Thanks for the explanation and notes. I was curious whether an uncertainty can be associated with the probability, so that the example above would read 3/5 = 0.6 +/- some uncertainty.
@SebastianRaschka I had another question, please: I noticed that some of the predict_proba outputs are decimals, e.g., 79.28 or 37.53. How are these generated?