How does loss_function compare probabilities (if there are 10 classes, then 10 probabilities) with the actual label (only a single value) during training? Shouldn't the model's prediction be an actual predicted value rather than probabilities? Does it map the highest probability to a single output value by taking max() over the classes?
That's a great question. The CrossEntropyLoss function takes two things: the model's predicted scores for each class (in PyTorch these are raw logits, which the function converts to probabilities internally via log-softmax) and the correct label we wanted the model to predict. It then takes the negative logarithm of the model's predicted probability for the correct/expected label, and that is the loss for that data point. Since we are minimizing the negative log of the probability, that's the same as maximizing the likelihood of the model predicting the correct class. Then during testing, we take the max() as done at 10:21. Great question!
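To make that concrete, here is a minimal sketch in pure Python of what the loss does for a single sample, assuming the model's output has already been converted to probabilities (the probability values and label below are made up for illustration):

```python
import math

def cross_entropy(probs, label):
    """Negative log of the predicted probability for the correct class."""
    return -math.log(probs[label])

# Hypothetical 10-class probability output for one sample
probs = [0.05, 0.70, 0.05, 0.02, 0.03, 0.04, 0.03, 0.03, 0.03, 0.02]
label = 1  # the single integer label for that sample

loss = cross_entropy(probs, label)  # -log(0.70), about 0.357

# At test time, the predicted class is the index of the max probability:
pred = max(range(len(probs)), key=probs.__getitem__)  # 1
```

So the loss never needs a single predicted value during training; it only looks up the probability at the index given by the label. The max() step is only used at test time to turn the 10 probabilities into one predicted class.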
You're the boss man, boss man.
which video goes into the train/test data loader code?
Thank you for the clarification!
5:30