Linear Ridge, Lasso And Logistic Regression:
-------------------------------------------------------------------------
Part I:
-------------------------------------------------------------------------
Agenda for the day: 1:47
Previous session recap: 6:03
Cost function: 6:25 7:47
Regression example: 7:20
Training data: 8:25 9:02
Overfitting: 9:13 10:30
Low bias and high variance: 11:45 19:17
Underfitting: 12:05
High bias and high variance: 13:45 19:30
Overfitting and underfitting scenarios: 18:20
Ridge and Lasso Regression situation: 22:00 22:30
Ridge Example: 25:38 29:50
Hyperparameters: 30:00
Lasso Regression: 32:44 36:00 (uses)
Feature selection: 35:20
Cross validation: 37:00
Quick summary: 37:33 38:37 (ridge) 39:40 (lasso) 40:16 (purpose of lasso)
Assumptions of Linear Regression: 46:30
-------------------------------------------------------------------------
Part II:
-------------------------------------------------------------------------
Logistic Regression: 47:35 48:10 50:00(scenario)
Why not Linear Regression? : 53:15 57:28
Squash: 59:00
Sigmoid function: 59:39 1:01:51
Assumptions: 1:02:44
Cost function: 1:09:38 1:15:00 1:16:15 1:19:20
Convex and Non-convex function: 1:10:45
Logistic regression algorithm: 1:22:00
Confusion Matrix: 1:29:50
Accuracy: 1:31:39
Imbalanced dataset: 1:33:28
Precision and recall: 1:37:00 1:37:45 1:45:00
F score: 1:46:43 1:47:46(F 0.5 score) 1:48:38(F 2 score)
Thanks man
Thank you
Thanks man !
@@narendratiwari4238 welcome
Thanks
Thanks
00:27 The main topics of discussion are ridge and lasso regression, logistic regression, and the confusion matrix.
08:25 Overfitting and underfitting are two conditions that affect model accuracy.
22:28 L2 regularization adds a unique parameter or another sample value to minimize the cost function.
27:53 Ridge regularization is used to prevent overfitting by creating a generalized model.
39:15 Preventing overfitting and feature selection are the key purposes of ridge and lasso regression.
45:08 Logistic regression is a classification algorithm.
56:09 Logistic regression is used for binary classification problems with a decision boundary.
1:01:56 Logistic regression is used to create a sigmoid curve that helps in binary classification.
1:13:03 Logistic regression cost function has specific equations for y=1 and y=0.
1:18:35 Logistic regression cost function and convergence algorithm.
1:31:22 Calculation of basic accuracy and imbalanced data.
1:37:06 The main aim of recall is to identify true positives.
1:48:48 F-score is calculated based on the value of beta.
Crafted by Merlin AI.
Super explanation of Ridge regression. Fundamentally it's there to prevent overfitting: because the cost stays non-zero, the algorithm keeps optimizing the slope value.
Ek teer, do nishan (one arrow, two targets):
overfitting is prevented, and the slope is optimized thanks to the new line.
Hi Krish,
Are the steps below correct for a regression problem?
1. In a linear regression model, first we do EDA, feature engineering, and data pre-processing, and split the data into train and test sets.
2. Create the model using linear regression and evaluate it, e.g. by finding the loss and the R² score.
3. If we see a high loss, we optimize using gradient descent or stochastic gradient descent to minimize the loss.
4. Finally, we check the bias-variance trade-off; if the model is overfitting, use L1 regularisation to prevent overfitting, and L2 regularisation to prevent overfitting and do feature selection as well.
Thanks,
L1 regularisation is the Lasso regression that performs feature selection, not L2.
Low Bias, High Variance (Overfitting): When a model has low bias and high variance, it means that the model is able to fit the training data very well (low bias), but it is overly sensitive to the specific training examples and may not generalize well to new, unseen data (high variance). Overfitting is characterized by capturing noise or random fluctuations in the training data.
To find an optimal model, there is a trade-off between bias and variance. The goal is to strike a balance that minimizes both bias and variance, leading to a model that generalizes well to new data. Techniques such as regularization and cross-validation are commonly used to address overfitting and find a suitable compromise between bias and variance.
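To make that concrete, here is a minimal sketch (my own, on synthetic data, not from the video) of one such technique: cross-validated ridge regression choosing its regularization strength.

```python
# Minimal sketch: cross-validated ridge regression on synthetic data.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=200, n_features=20, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# RidgeCV tries several regularization strengths via cross-validation and
# keeps the best, penalizing large slopes so the model generalizes better.
model = RidgeCV(alphas=[0.1, 1.0, 10.0, 100.0]).fit(X_train, y_train)
print("chosen alpha:", model.alpha_)
print("train R^2:", model.score(X_train, y_train))
print("test R^2:", model.score(X_test, y_test))
```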
@krish naik I've gone through multiple sites, and I'm observing that underfitting is high bias and low variance.
I think after your 7 days series on ML , DL, EDA, time series, we can participate in kaggle competition. This would be the most efficient way to learn data science ! Hope you can do the series for DL and EDA too !
Normal distribution of features is not an assumption of Linear Regression.
We want normal distribution to avoid overfitting by outliers.
@@ammar46 most relevant comment to what @minhaoling3056 said
Great work! Krish
Thank you Krish, I am watching your ML algorithm videos again and again to get better.
Please make similar live or recorded videos on the basics of time series forecasting, explaining all the concepts.
High bias and low variance : For Underfitting : 14:26 min
The 1st ML session has 247K views, but this 2nd session has only 34K. That is very bad. People always love to start things, but after that they hate to continue them; they don't stick with it. That's why people don't get that many job offers and fail interviews.
many thanks sir many thanks
Thank you sir.
Thank you so much; these detailed, structured videos are very helpful.
Thanks man ! god bless you
14:32 Correction: underfitting occurs if the model or algorithm shows low variance but high bias (in contrast to the opposite, overfitting, from high variance and low bias).
If the model has high bias, how will it have low variance?
Amazing lecture. Can you explain the GLM link function in detail? I feel talking about the range of y and mx+c after the conversion would help.
Well explained in a simple way, sir 🙏
Thanks man
Sir,
underfitting means high bias and low variance.
awesome session.. thank you
When I read about Linear Regression, I always see Ordinary Least Squares mentioned as the most used algorithm to find the theta parameters. Why didn't Krish mention it? Is it not important? Can anyone explain?
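For anyone else wondering: OLS is the closed-form way of finding those thetas, via the normal equation theta = (X^T X)^(-1) X^T y, while gradient descent reaches the same minimum iteratively. A minimal numpy sketch on made-up data:

```python
# Minimal sketch: Ordinary Least Squares via the normal equation.
import numpy as np

rng = np.random.default_rng(0)
X = np.c_[np.ones(100), rng.normal(size=(100, 2))]  # prepend an intercept column
true_theta = np.array([1.0, 2.0, -3.0])
y = X @ true_theta + rng.normal(scale=0.1, size=100)

# Solve (X^T X) theta = X^T y directly instead of iterating with gradient descent.
theta_hat = np.linalg.solve(X.T @ X, X.T @ y)
print(theta_hat)  # approximately [1.0, 2.0, -3.0]
```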
Awesome, sir. I really want to say thanks for this information in such a crisp manner. Thanks so much.
Very comprehensive and amazing teaching, sir. I can't thank you enough.
Lovely one..
now I need a pepto bismol after looking at the eqns
Superb explanation sir wonderful 😊
41:52 Assumptions of LR
You are the Guru........🙏🙏🙏🙏🙏
#KingKrish
Excellent video
Please Cover Coding along with tutorial
sir, you are great.
finished watching
Please don't confuse learners; "the model should follow a normal distribution" is wrong. It is "the residuals should have a normal distribution". In linear regression, the errors are assumed to follow a normal distribution with a mean of zero.
When there is high bias and high variance, predictions will be inconsistent and not accurate; low bias and low variance is always the ideal model.
Low bias, high variance: overfitting
High bias, low variance: underfitting
High bias High Variance: Underfitting. If the model performs poorly on train data, how will it perform good on test data? Clearly the model will not be able to generalise well.
@Krish Naik Sir, I am not able to find this content uploaded in the mega community course. Please let me know how I can get these slides.
Hi Krish, please explain how slopes become 0 in the case of Lasso.
I have a doubt: he mentioned that lasso will do feature selection and ridge can't. The explanation he gave was that in ridge, squaring the slope will increase it, but not in lasso.
My doubt is: if a feature is not important, then its slope will be less than one, and its square will be even smaller; it's not going to increase. So how is ridge ineffective for feature selection? It should give a better result than lasso in that case.
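A minimal sketch (my own synthetic example, not from the video) of what actually happens: ridge shrinks the useless slope toward zero but keeps it non-zero, while lasso's absolute-value penalty can set it to exactly zero, which is why only lasso does feature selection.

```python
# Minimal sketch: ridge shrinks small slopes, lasso can zero them out.
import numpy as np
from sklearn.linear_model import Lasso, Ridge

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
# Only the first two features matter; the third is pure noise.
y = 3.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.1, size=200)

print("ridge:", Ridge(alpha=10.0).fit(X, y).coef_)  # all slopes shrunk, none exactly 0
print("lasso:", Lasso(alpha=0.1).fit(X, y).coef_)   # the noise feature's slope is exactly 0.0
```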
Most important part 1:29:00
Hi Krish, you have taught much better than Sudhansu.
If we square the less significant coefficients, the square reduces them even further, so according to this particular scenario ridge is better, right?
#Thanks Sir
Just have a little doubt here: at 41:00, why didn't we divide the cost function by 2m?
1:10:01 Do we get a convex function because of the cost function, or because of the sigmoid?
There was a small mistake in the explanation of lasso (L1) regression: we are supposed to sum the absolute values of the slopes, not take the absolute value of the sum of the slopes. The two are different.
In the video you wrote |theta0 + theta1 + theta2 + theta3 + theta4 + ... + theta_n|,
but the L1 norm should actually be |theta0| + |theta1| + |theta2| + |theta3| + ... + |theta_n|.
Hope you get my point.
Thank you
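A two-line check of the difference, with made-up slope values; the signs make the wrong version cancel out:

```python
# Sum of absolute values vs. absolute value of the sum: not the same thing.
import numpy as np

theta = np.array([2.0, -3.0, 1.0])
print(np.abs(np.sum(theta)))  # |2 - 3 + 1| = 0.0 -> wrong: the penalty vanishes
print(np.sum(np.abs(theta)))  # |2| + |-3| + |1| = 6.0 -> correct L1 norm
```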
Thanks
Thank you
Also, there shouldn't be a 1/2 factor in the logistic regression cost function. 1:22:35
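For reference, the standard convex logistic regression cost (cross-entropy) has no 1/2 factor; it merges the y=1 and y=0 cases into one expression:

```latex
J(\theta) = -\frac{1}{m} \sum_{i=1}^{m} \Big[ y^{(i)} \log h_\theta(x^{(i)})
          + \big(1 - y^{(i)}\big) \log\big(1 - h_\theta(x^{(i)})\big) \Big],
\qquad h_\theta(x) = \frac{1}{1 + e^{-\theta^{T} x}}
```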
Bro, please explain in terms of vectors, and how to get the solutions of these equations in vector form.
Great
Please give an example of Lasso Regression
Underfitting means high bias and low variance. Please correct it.
Sir very very nice sir
In logistic regression, our dependent feature may depend on multiple independent features; how can I deal with this? Thank you.
1:02 What is g(z) here, Krish? Is it the predicted variable y?
There is a big myth that the normality assumption is for the dependent feature.
The reality is:
the normality assumption is for the residuals (errors), not for the features,
because only if the residuals follow a normal distribution does their sum of squares follow a chi-square distribution, and only then does the ratio MSR/MSE follow an F distribution.
Can you post a video on Cook's distance and leverage?
Please update the study materials.
linear regression
Overfitting: Good performance on the training data, poor generalization to other data (low bias but high variance).
Underfitting: Poor performance on the training data and poor generalization to other data (high bias and high variance).
Please give the link for the notebook
Does anybody have the materials for these live sessions? I tried to find them on the link that's provided but that isn't working.
Where do I find all the materials related to this 7-day program?
Hi Krish, are the materials available even now? How do I download them?
Have you downloaded the material/resources?
Can I know about the live projects and when they are starting?
Great session! Someone please help: I am unable to download the material.
Please arrange a coding session for ML.
Sir, the notes are not available at the given link; it seems to be invalid. Please provide it for practice.
Hi Krish, I am not able to get into the community forum to get the PDF file which you wrote during the course.
Have the documents been removed from the community forum?
Did you get the PDF? I too am unable to get it.
Overfitting and underfitting use
Does anybody have notes for this course? It would be very helpful if someone could share them, or say where to access them.
Can anyone please post the notes over here? I'm unable to open the link, as it has expired.
Hi sir, my dataset contains 297 features and 9 prediction classes, and the results with logistic regression are low. Why?
Is it because the outcome is not in a binary format that the results are poor?
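For what it's worth, scikit-learn's LogisticRegression handles a multi-class target directly, so the 9 classes alone shouldn't make the results poor; with 297 features, scaling and regularization usually matter more. A minimal sketch with stand-in data of roughly that shape:

```python
# Minimal sketch: multi-class logistic regression on stand-in data.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=2000, n_features=297, n_informative=30,
                           n_classes=9, random_state=0)
# Scaling the 297 features before fitting often helps convergence and accuracy.
clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
print(clf.fit(X, y).score(X, y))
```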
In spam classification, why do we use precision?
Are these for freshers?
Notes are not available on community
Can someone share the PDF of this series?
Assumptions of linear regression:
Linearity
Normality of errors
Independence of errors
No autocorrelation
Homoscedasticity: residual variance is constant, and the mean of the residuals equals 0
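A quick sketch (my own synthetic example) of how one might check the residual assumptions above after fitting OLS:

```python
# Minimal sketch: checking residual assumptions after an OLS fit.
import numpy as np
import statsmodels.api as sm
from scipy import stats

rng = np.random.default_rng(0)
X = sm.add_constant(rng.normal(size=(200, 2)))
y = X @ np.array([1.0, 2.0, -1.0]) + rng.normal(scale=0.5, size=200)

resid = sm.OLS(y, X).fit().resid
print("mean of residuals:", resid.mean())                    # should be ~ 0
print("Shapiro-Wilk p-value:", stats.shapiro(resid).pvalue)  # tests normality of errors
```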
True, Normal distribution of features is not an assumption of Linear Regression.
We want normal distribution to avoid overfitting by outliers.
Sir, where can I get this PDF?
Just published by @Krish Naik, new video describing Lasso and ElasticNet:
th-cam.com/video/qbJKrlOxlJA/w-d-xo.html
- with helpful numerical examples of how feature selection works in Lasso.
I have completed my boards. Can I join? Is it relevant to me?
i am unable to get the material
the notes link is not working
Hi guys, asking this for a requirement I'm working on: how do I reduce the false positives in my model? I'm getting 1700 positive predictions, of which only 46 are actual positives. It would be great if someone could help me. Thanks in advance!
Increase the threshold or cut-off criterion. For example, if the rule is y=1 when the probability is greater than .5, change it to .6, then .7.
This will reduce your FPs, but some of those cases will be rearranged elsewhere, mostly into FNs.
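A minimal sketch of this thresholding idea on made-up imbalanced data (the numbers are illustrative only): raising the cutoff predicts "positive" less often, trading false positives for false negatives.

```python
# Minimal sketch: raising the decision threshold to cut false positives.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix

X, y = make_classification(n_samples=5000, weights=[0.97], random_state=0)
proba = LogisticRegression().fit(X, y).predict_proba(X)[:, 1]

for threshold in (0.5, 0.6, 0.7):
    tn, fp, fn, tp = confusion_matrix(y, (proba >= threshold).astype(int)).ravel()
    print(f"threshold={threshold}: FP={fp}, FN={fn}, TP={tp}")
```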
Where are these notes?
Sir, if logistic regression is a classification algorithm, then why is it called logistic regression and not logistic classification?
Because eventually it's predicting the probability of the dependent variable belonging to a particular class, and hence the output is a continuous variable. That's why it's called logistic regression.
@@rupalacharyya4606 thanks, I also had the same confusion....but now it's clear with your explanation 👍
Bro ignore the name focus on the game 😉
I see no reason why (h_theta(x) - y)^2 for logistic regression is non-convex. 🧐
Sir, please update the phone numbers and the links in the description
Sir, can you explain it in Hindi as well?
It's already uploaded on the Krish Hindi channel.
I don't understand how underfitting = high bias and high variance.
Please, someone give me a link to read about it.
Underfitting - high bias
Overfitting - high Variance
Bias relates to training-data accuracy and variance relates to testing-data accuracy.
So when we get low accuracy on the training data, we have high bias, meaning the data is not fitted correctly;
similarly, when we get low accuracy on the testing data, we have high variance, meaning the predictions are not accurate.
Hope the explanation helps.
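A small sketch of this diagnostic (my own synthetic example; a decision tree is used only because it overfits easily): a large gap between training and testing accuracy signals high variance.

```python
# Minimal sketch: a train/test accuracy gap as an overfitting signal.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1000, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

deep = DecisionTreeClassifier(random_state=0).fit(X_tr, y_tr)  # unconstrained, so it overfits
print("train accuracy:", deep.score(X_tr, y_tr))  # near 1.0 -> low bias
print("test accuracy:", deep.score(X_te, y_te))   # noticeably lower -> high variance
```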
Please teach on white screen.
Why are you making most of the videos members-only content when they were free before? Is it greed for money now?
Thank you, sir.
Thanks