Machine Learning Tutorial Python - 8 Logistic Regression (Multiclass Classification)
- Published on 22 Jun 2024
- Logistic regression is used for classification problems in machine learning. This tutorial shows how to use sklearn's LogisticRegression class to solve a multiclass classification problem: predicting handwritten digits. We will use sklearn's load_digits to load a readily available dataset from the sklearn library and train our classifier on it.
#MachineLearning #PythonMachineLearning #MachineLearningTutorial #Python #PythonTutorial #PythonTraining #MachineLearningCource #LogisticRegression #sklearntutorials #scikitlearntutorials
Code: github.com/codebasics/py/blob...
Exercise: Open the above notebook on GitHub and go to the end.
Topics covered in this video:
0:00 - Theory (Binary classification vs multiclass classification)
0:26 - How to identify hand written digits?
1:02 - Coding (Solve a problem of hand written digit recognition)
11:24 - Confusion Matrix (sklearn confusion_matrix)
12:42 - Plot confusion matrix using seaborn library
14:00 - Exercise (Use sklearn iris dataset to predict flower type based on different features using logistic regression)
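The coding steps listed above can be sketched roughly as follows. This is a minimal outline rather than the notebook's exact code: the `random_state=0` split and `max_iter=5000` are assumptions chosen here so that the run is repeatable and the solver has room to converge.

```python
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split

# 1797 images of 8x8 pixels, each flattened to 64 features
digits = load_digits()

# Hold out 20% of the samples for testing (seed is an assumption, for repeatability)
X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target, test_size=0.2, random_state=0
)

# max_iter is raised above the default so the solver has room to converge
model = LogisticRegression(max_iter=5000)
model.fit(X_train, y_train)

accuracy = model.score(X_test, y_test)
cm = confusion_matrix(y_test, model.predict(X_test))  # rows: truth, columns: predicted

print(f"test accuracy: {accuracy:.3f}")
print(cm.shape)
```

The confusion matrix has one row and one column per digit, so it can be passed directly to a seaborn heatmap as done later in the video.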
Do you want to learn technology from me? Check codebasics.io/?... for my affordable video courses.
Next Video:
Machine Learning Tutorial Python - 9 Decision Tree: • Machine Learning Tutor...
Popular Playlists:
Data Science Full Course: • Data Science Full Cour...
Data Science Project: • Machine Learning & Dat...
Machine learning tutorials: • Machine Learning Tutor...
Pandas: • Python Pandas Tutorial...
matplotlib: • Matplotlib Tutorial 1 ...
Python: • Why Should You Learn P...
Jupyter Notebook: • What is Jupyter Notebo...
Tools and Libraries:
Scikit learn tutorials
Sklearn tutorials
Machine learning with scikit learn tutorials
Machine learning with sklearn tutorials
To download the CSV files and code for all tutorials: go to github.com/codebasics/py, click the green button to clone or download the entire repository, and then go to the relevant folder to access the specific file.
🌎 My Website For Video Courses: codebasics.io/?...
Need help building software or data analytics and AI solutions? My company www.atliq.com/ can help. Click on the Contact button on that website.
#️⃣ Social Media #️⃣
🔗 Discord: / discord
📸 Dhaval's Personal Instagram: / dhavalsays
📸 Instagram: / codebasicshub
🔊 Facebook: / codebasicshub
📝 Linkedin (Personal): / dhavalsays
📝 Linkedin (Codebasics): / codebasics
📱 Twitter: / codebasicshub
🔗 Patreon: www.patreon.com/codebasics?fa...
There are very few teachers who actually make us fall in love with learning. You have an incredibly fascinating way of teaching Sir!!
😊👍
I do not usually comment but you wrote the code so simple and explained so beautifully that i had to praise you. Thank you so much !!
Complete machine learning tutorial playlist: th-cam.com/video/gmvvaobm7eQ/w-d-xo.html
I solved the exercise and my model got an accuracy of 96.67%
Thanks for making such great videos.
@@anujvyas9493 can you please send the solution..i also got same accuracy but unable to do prediction
@@sonalgarg5628 Sure! Email ID ?
@@anujvyas9493 sonal.garg@gla.ac.in
@@sonalgarg5628 Sent it to you! Sorry for the late reply.
Thanks for your teaching! I like your tutorials and exercises, that make me quickly understand.
Great tutorial, thanks a ton for sharing this amazing stuff. Request you to start a series on NLP, Deep Learning or Text Analytics
As always great video. Greetings from Brazil!
I got accuracy 93% for iris data set. Thank you very much to make ML simple.
Sir, whatever you teach is very, very interesting, and I think I am the luckiest person to be learning from your videos
It's very helpful for us and you are great.
I have seen many videos but no one teaches like you.
Thank you. I wish I had discovered your channel 6 months ago. I could have saved so much time.
Within 2 days I have become addicted to this channel... I have been on this channel for around 5-6 hours continuously... Please continue the series... Thanks
Thank You. After watching previous 8 videos, I tried this Iris exercise on my own and my model actually predicted so well, with a score of 1.0
it is overfitting bro
@@satyazigyansu6873 accuracy varies with random state and test size
random state = 42 and test size = 0.2 then accuracy = 100%
random state = None and test size = 0.3 then accuracy is around 97% and it varies every time
@@satyazigyansu6873 no brother, it depends on whether the score is on the testing or the training dataset. If it is on the testing dataset then it is not overfitting; if it is on the training dataset then it is overfitting.
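A quick way to see how much the split alone moves the score is to repeat the experiment with several seeds. The sketch below assumes the built-in `load_iris` dataset, 20 arbitrary seeds, and `max_iter=1000`; none of these come from the video. With only 30 test samples, a single misclassified flower shifts accuracy by about 3.3 percentage points, so a score of 1.0 on one split is not by itself a sign of overfitting.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)

scores = []
for seed in range(20):  # 20 different train/test splits (arbitrary choice)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=seed
    )
    model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
    scores.append(model.score(X_test, y_test))

scores = np.array(scores)
print(f"min={scores.min():.3f} mean={scores.mean():.3f} max={scores.max():.3f}")
```

Reporting the mean over many splits (or using cross-validation) gives a steadier number than any single `random_state`.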
Thanks sir. Simply you are great for such type of free courses.Nice service to society
I CAN'T SAY THIS ENOUGH - THANK YOU!
Thank you for these awesome tutorials. Please upload next tutorials.
Nice tutorial, I have forked your project PY .THX
The content is actually very engaging and helps you to learn complex topics very easily
probably the best tutorial series for beginner thank you!!!!!
Glad it was helpful!
Very clear, thank you!
you are a great teacher....
thank u for this series
Your explanation is on a different level. Just one request: please add the different machine learning algorithms a bit faster, as once someone starts learning from your channel they get hooked on it ...
Got 96.66% accuracy.....while practicing on your given iris.csv dataset...I am new on your channel, but got addicted to your videos, especially to the playlist of machine learning... please keep teaching us in same way. Thanks a lot..
That’s the way to go Kashif, good job working on that exercise
Bro we need to download exercise from kaggle? As sir only uploaded image on github
@@codebasics accuracy varies with random state and test size
random state = 42 and test size = 0.2 then accuracy = 100%
random state = None and test size = 0.3 then accuracy is around 97% and it varies every time
A little detail... after updating sklearn to version 0.20.2 or higher you need to specify the solver and multi_class parameters to avoid warnings. For instance "model = LogisticRegression(solver = "newton-cg", multi_class="auto")"
Thank you very much. You just saved me a big headache. I had the warning and came looking to the comments for help. Great job.
@@russnagel1 Happy to see that the comment is helping somebody. You made my day.
Very helpful, I tried using max_iter / n_iter to 200, in the model.fit() part, but that didn't work either.. eventually, it's your suggestion that did work!
my savior
u can also use standard scaler
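The suggestions in this thread can be combined in one sketch. Two points are worth noting: `max_iter` is an argument of the `LogisticRegression` constructor, not of `fit()`, and as far as I know the `multi_class` parameter is deprecated in recent scikit-learn releases, so it is left out here. The pipeline, the `random_state=0` split, and `max_iter=1000` are assumptions for illustration.

```python
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Scaling the pixel features usually helps the solver converge in fewer
# iterations; max_iter is set on the estimator, not passed to fit().
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)

score = model.score(X_test, y_test)
print(f"test accuracy: {score:.3f}")
```

Putting the scaler inside the pipeline means it is fitted on the training split only and then applied unchanged to the test split.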
Thank you so much....liked and subscribed.
Awesome. Thanks for sharing. I love the way you teach topics. So easy to understand. Thanks again.
Yup nitin, things don't have to be taught in a hard way.. there is always an easy way to explain the concepts :)
One of the best tutorial... Thankyou so much...It is very helpful and informative.... I wish to see more videos on other topics...
Glad, you liked it.
Respect and appreciation from 🇵🇰 . Interesting teaching skill. 👍
thanks for good tutorials
Another awesome video! Thank you
finalllllly I understood how to interpret confusion matrix for multiclass classification thankyou!!!!
I got accuracy of 96.66%.
Thank you so much for your initiative. Best part of your playlist is exercises that give confidence and a clarity how to apply logics in form of code. And best part you talk about practical use cases.
accuracy varies with random state and test size
random state = 42 and test size = 0.2 then accuracy = 100%
random state = None and test size = 0.3 then accuracy is around 97% and it varies every time
for best results choose random state = 42 or 10
simply amazing
Thanks, your tutorials are very clear, intuitive and easy to understand.
Rakesh, thanks for your kind words of appreciation
On my way to watch your whole playlist. You are a great teacher! I got accuracy 95.6%
👍😊 wish you all the best
Can you share your solution ?
What parameters did you use for the LogisticRegression model?
Thank you so much for this invaluable series
Glad you enjoy it!
@@codebasics
Kindly make a video on confusion matrix multiclass classification please 🙏
Thanks a lot, very clear
really appreciate your hard work. from your videos it was super easy to learn the concept . thank you
You are most welcome
Iris dataset -> 97.777777777777 accuracy with test_size =0.3
I have fallen in love with this amazing knowledge 🤩.Thanks a lot Sir ❤️.
I got an accuracy of 1 with test_size=0.2.
I got 96.66% accuracy for Iris dataset exercise. Great work! Thoroughly enjoying and learning a lot from your courses.
i got 94.73%
does it vary? or have I made a mistake?
I got 100.0%
@@digvijaymahamuni7722 this is due to a very small dataset
@@fazalahmad1546 check for overfitting
Hey guys, chill, it isn't like you are working on the backend developing the library. Also, it is a relatively clean dataset already.
Thank you and practice exercises are useful as well
Glad you liked the exercises Vishnu
Amazing content you make it all seem easy
Glad you liked it Mohammed.
I got 96.66 accuracy. Thanks.
Waiting for your next videos. Hope you will upload soon.
Good approach for coding the basic machine learning . Carry on
👍😊🙏
Very helpful, thanks!
Glad it was helpful!
got accuracy of 93.34%. Thanku very much really addicted to your videos.
Well done
Great job, Thank you very much
Glad you liked it!
Excellent explanation
at 7:50 , use this >> model = LogisticRegression(solver='lbfgs',class_weight='balanced', max_iter=10000) to avoid this warning >>> 'ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.'
Thank You Mabel Karani , you helped me !
thanks 😊
thks
@@muhammadtayyab2148 what was ur accuracy score using the above parameters?
Thank you madam
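To check whether the solver actually hit its iteration limit, rather than only silencing the message, one option is to record warnings during `fit` and inspect the fitted model's `n_iter_` attribute. This is only a sketch: the two `max_iter` values are arbitrary, and whether a warning appears depends on the scikit-learn version and on the data.

```python
import warnings

from sklearn.datasets import load_digits
from sklearn.exceptions import ConvergenceWarning
from sklearn.linear_model import LogisticRegression

X, y = load_digits(return_X_y=True)

results = {}
for max_iter in (100, 10000):
    # Record any warnings raised while fitting instead of printing them
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        model = LogisticRegression(max_iter=max_iter).fit(X, y)
    hit_limit = any(issubclass(w.category, ConvergenceWarning) for w in caught)
    results[max_iter] = (int(model.n_iter_.max()), hit_limit)
    print(max_iter, model.n_iter_, "hit the limit" if hit_limit else "converged")
```

If `n_iter_` equals `max_iter`, the solver stopped because it ran out of iterations, which is the situation the warning describes.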
nice tutorial.... by watching your tutorials a lot of people are opening institutes in Hyderabad
Ha ha.. are you serious? 😊
Sir,
Please make videos on other topics of machine learning like k nearest neighbour , support vector machines. Your videos are very very helpful. please continue this series.🙏🙏
You can refer to videos of sentdex. The videos are much better including k nearest neighbor. th-cam.com/video/OGxgnH8y2NM/w-d-xo.html
that was awesome🤩🤩
Thank you so much sir :)
I loved the tutorial! , got an accuracy of 97.72 %
Great job!
how?.....I'm getting only 56%
Sir your way of describing things is very easy to grab and understand. Thank you for the tutorial. I request you to please also make a few videos of analyzing data (statistics) before using it into a model. Like variable correlation, and what variable should be used and which one should be dropped, etc.
point noted kuldeep and thanks for your appreciation. I want to add lot more content but unfortunately facing health troubles. once i recover I will be back with full force :)
I loved this tutorial..! Absolutely awesome...!! I got an accuracy of up to 96.6%
That’s the way to go Harsh, good job working on that exercise
awesome!
Is there no Exercise solution?
Amazing lecture! I got an accuracy of 93.3%
i got 88% score
Wow, Your videos are amazing!
And i got an accuracy of 96%
great. thanks for working on exercise and congrats on getting such a high accuracy score. Good job :)
Sir can u send me that code please... I am not getting that so...
@@vinays.m6831 PFB code. Please let me know if anything is incorrect.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
iris = load_iris()
# hold out 20% of the samples for testing
x_train, x_test, y_train, y_test = train_test_split(iris.data, iris.target, test_size=0.2)
irisModel = LogisticRegression()
irisModel.fit(x_train, y_train)
# predict the class index of each test sample and print its species name
targetIndex = irisModel.predict(x_test)
for i in range(len(targetIndex)):
    print(iris.target_names[targetIndex[i]])
# accuracy on the held-out test set
print(irisModel.score(x_test, y_test))
Thanks so much for these great tutorials! I wish you would upload the continuation of this playlist faster so we can learn fast.
@@codebasics Wow, I admire the fact that you're able to make these videos despite your busy schedule. Keep it up!
@@asedaaddai-deseh8152 medium.com/trainyourbrain/would-you-read-this-article-or-not-b757d0e26cf8
got 93.33% accuracy. Thank u so much for this playlist..
i also got 93.33% accuracy can you please tell me how you did it I want to cross check my procedure.
Loving your Lectures sir.
Could you please use any best deep learning model for this dataset.
Or Suggest me one. :)
Good video.
my model is 100 percent accurate for iris dataset. thanks for teaching all the topics which are really important in a clean and clear way.
please do tutorials on Computer vision using Tensorflow
That would be awesome!! ♥️
I got 100% accuracy for the iris exercise. Sir give more exercise. These are very helpful, thanks a lot sir
didn't you get total no. of iterations reached ??
Can you help me out ??
increase the size of your test data and then check
@@shreyjoshi18 okay .. thanks
awesome
thank you
Thank you so much sir !I am so so grateful to you for these wonderful tutorials ,hope i can learn even more and faster.Btw i got my accuracy as 97.77 !
Bro I got the same but is it correct? How can accuracy be so high? Please can you explain
@@pranav9339 because the trends are very similar in the test set data too ig and the variance is also low ...that's the reason i think
Awesome exercise! I got an accuracy of 97.77%
Can you please provide the solution link as it is not there on github? It would be helpful.
Hi, @@aditinagar6688. Please see below:
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sn
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split
iris = load_iris()
print(dir(iris))
df = pd.DataFrame(iris.data, columns=iris.feature_names)
print(df.head())
df["target"] = iris.target
print(df.head())
# replace the numeric class index with the species name
df["target"] = df["target"].replace({0: "setosa", 1: "versicolor", 2: "virginica"})
print(df.head(-10))
x = df.drop(["target"], axis=1)
y = df["target"]
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.3)
print(len(x_train))
print(len(x_test))
model = LogisticRegression()
model.fit(x_train, y_train)
print(model.score(x_test, y_test))
print(model.predict(x_test))
#print(y_test)
# predict a single flower; reuse the training column names to avoid a feature-name warning
print(model.predict(pd.DataFrame([[4.9, 3.0, 1.4, 0.2]], columns=x.columns))) #setosa
y_predicted = model.predict(x_test)
# rows of cm are the true classes, columns are the predicted classes
cm = confusion_matrix(y_test, y_predicted)
print(cm)
plt.figure(figsize=(10, 7))
sn.heatmap(cm, annot=True)
plt.xlabel("Predicted")
plt.ylabel("Truth")
plt.show()
@@cindinishimoto9528 thank you so much..!
this helps a lot.. i was not able to figure out how to handle that dataset!
@@tejobhiru1092 ^_^
@@cindinishimoto9528 i also need this exercise code very badly
great
nice explanation. I have one question: what if we have a mix of dependent variables, binary as well as multiclass? Is it fine to apply multiclass regression?
At 15:33 I thought you are going to say 'plz plz subscribe the channel, like, comment, share... :) Thank you sir for making such a great videos...
Can't express how grateful i am to you Sir.
I am very willing to even pay for your stuff and help you somehow.
Thanks once again, my accuracy was about 92%
how much ratio test/train u use? i got 91...%
@@zerostudy7508 20% of the data to be tested. But the accuracy depends as we are getting random data to be trained or tested. My opinion is that your model is correct, we just have different trained data.
@@leooel4650 Thank you so much buddy, I just checked that if I use 90% of the data for training and 10% for testing I get 88-90% accuracy, but when I use 80% of the data for training I get, on average, 90-100% accuracy. I'll tell you when I have figured something out....
@@zerostudy7508 happy to help as I am still figuring things out.
@@leooel4650 I got it
it has something to do with sample and population
if test A=20 and test B=10
and they both got just 1 wrong answer
A and B Standard Deviation Sample are
A=0.217944947
B=0.316227766
about 10% difference
in a nutshell its sound like this:
your teacher give 10 questions for exam and your friend got 100, if both of you had 1 wrong answered in the exam, which of you have the highest test score ?
have a nice day
Superb content, liked this very much
12:50, maybe there's a simple mistake that xlabel should be Truth while ylabel should be Predicted, as we have defined cm in that way
getting a score of 1.0, by using newton-cg solver. Default LogisticRegression() shows warning. You can use model = LogisticRegression(solver = 'newton-cg', multi_class='auto') for better training and accuracy.
Amazing tutorial:) How to make roc_curve for this multiclass problem?
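One common way to get ROC curves for a multiclass problem is one-vs-rest: treat each digit in turn as the positive class and use that class's predicted probability as the score. The sketch below assumes the digits dataset, a stratified split with `random_state=0`, and `max_iter=5000`; it collects the curve points in a dictionary (a name introduced here for illustration) instead of plotting them.

```python
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score, roc_curve
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import label_binarize

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)

model = LogisticRegression(max_iter=5000).fit(X_train, y_train)
proba = model.predict_proba(X_test)  # one probability column per class

# Macro-averaged one-vs-rest AUC over all ten digits
macro_auc = roc_auc_score(y_test, proba, multi_class="ovr")

# One ROC curve per digit: that digit versus all the others
y_onehot = label_binarize(y_test, classes=model.classes_)
curves = {}
for i, cls in enumerate(model.classes_):
    fpr, tpr, _ = roc_curve(y_onehot[:, i], proba[:, i])
    curves[int(cls)] = (fpr, tpr)

print(f"macro one-vs-rest AUC: {macro_auc:.3f}")
```

Each `(fpr, tpr)` pair can then be drawn with `plt.plot(fpr, tpr)` to get one curve per digit on the same axes.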
How to recognize whether the classification dataset is linear or non linear if there are multiple variables and cannot be plotted?
sir I have done the Iris flower exercise according to what I have learnt from you. I got an accuracy of 1.0 (I think it is 100%) !
I just did everything according to what I have learnt from you!
Perfect and keep it up. The dataset is small hence getting accuracy of 1 is not unusual
if you have given random_state or shuffle=True then the accuracy will be 1
@@vedanthbaliga7686 even without a random_state or shuffle it s still possible to get 1, it s all due to the fact that our dataset here is small
How to visualize the decision boundary through a plot and how to optimize using the log loss function, and whatever you are teaching that teaching everyone.
Sir done with the assignment. Got 100% train accuracy for iris dataset and also plotted the confusion matrix.
accuracy=93.3%
thankyou sir
Thank you for explaining this in such a nice and easy way. BTW, I downloaded the whole GIT files but could not find the exercise solution for this session, so If some one has a clue please let me know.
Yaa the answer for this exercise is not in the file. I solved the exercise, you can also try in the same way as in the model problem. but in the Handwritten digit problem, i got an error when fitting the model :( , i cant correct the error. It showing 'str' object has no attribute 'decode'. Can you help me to come out from this.
I got 100% accuracy🎉🤩
Hi ,as you said sigmoid function will convert values to 0 or 1 ,how is it possible to predict digits with this concept ?,for binary output I got it but for digits it confused me
do you need preprocessing for scaling data?
sir in this video i think you took x and y axis reverse in labelling the cause in confusion matrix arguments its x and y respectively right?
At 12.17 what we predicted was for X_test. Why did we compare the Y_test and X_predictions? Am i understanding it wrong?
😀
All your videos on any topic have deep theory explained with a notebook. Can you suggest a good resource or book for Machine Learning?
th-cam.com/video/OGxgnH8y2NM/w-d-xo.html
100% accuracy.
Thank You Sir.
can you cover multicolinearity check in logistic regression
so it can only take inputs and predict images from the dataset?, how if i want to predict other images that are not from the digit dataset?
@12:47 maybe not that important.. but just for my clarification, I would like to confirm... should plt.xlabel not be 'Truth' and plt.ylabel be 'Predicted' ? Thank you for your hard work.
Yes, it is. It has produced merely transpose.
even I've the same doubt
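For anyone checking the axis labels: in scikit-learn, `confusion_matrix(y_true, y_pred)` puts the true classes on the rows and the predicted classes on the columns, and a seaborn heatmap draws rows along the y-axis and columns along the x-axis. Under that convention, "Predicted" on the x-axis and "Truth" on the y-axis is consistent. A tiny made-up example (the labels below are invented for illustration) makes the orientation visible:

```python
from sklearn.metrics import confusion_matrix

# Three samples: the second one is truly class 0 but was predicted as class 1
y_true = [0, 0, 1]
y_pred = [0, 1, 1]

cm = confusion_matrix(y_true, y_pred)
print(cm)
# [[1 1]
#  [0 1]]
# cm[0, 1] == 1: row 0 (truth = 0), column 1 (predicted = 1)
```

The single mistake shows up in row 0, column 1, which confirms that rows index the truth and columns index the prediction.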
Very informative. As per my understanding LR model predicts the binary classification problem. It would be great if you can share how this predicts this multi class problem?
Check machine learning tutorial playlist on my channel. I have example for binary classification as well and in fact this particular tutorial is for multiclass classification
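On how a model built around a 0-to-1 function can pick among ten digits: the classifier produces one probability per class and predicts the class with the largest one. In current scikit-learn versions the default solver fits a multinomial (softmax) model, while older versions defaulted to one-vs-rest, meaning one binary sigmoid classifier per digit. The sketch below, which assumes `max_iter=5000` and fits on the whole dataset purely for illustration, shows the per-class probabilities directly.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression

X, y = load_digits(return_X_y=True)
model = LogisticRegression(max_iter=5000).fit(X, y)

# One row per sample, one column per digit class; each row sums to 1
proba = model.predict_proba(X[:5])

# predict() returns the class whose probability is largest
predicted = model.classes_[proba.argmax(axis=1)]

print(np.round(proba[0], 3))
print(predicted)
```

So the binary idea is not abandoned; it is extended to ten scores that compete with each other.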
I have a question please. Once you have built the model, how do you then use it to show your company how to target certain customers for better results?
How can this model be used to recognize a new target image from outside the digits library? How to view the classification graph?
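To predict an image that is not part of the dataset, the image has to be brought into the same form the model was trained on: 8x8 pixels, flattened to 64 values, with intensities on the 0 to 16 scale that `load_digits` uses. The sketch below is hypothetical throughout: `my_image` stands in for an image you would load and resize yourself, and here it is simulated by rescaling a dataset image to the 0 to 255 range.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression

digits = load_digits()
model = LogisticRegression(max_iter=5000).fit(digits.data, digits.target)

# Hypothetical stand-in for an external 8x8 grayscale image with values 0..255.
# A real image would first need resizing to 8x8 and the same polarity
# (higher values where the ink is).
my_image = digits.images[0] / 16.0 * 255.0

# Rescale to the 0..16 range of the training data and flatten to one row
features = (my_image / 255.0 * 16.0).reshape(1, -1)

prediction = model.predict(features)
print(prediction)
```

Real handwriting photographed or scanned at higher resolution tends to look different from this dataset after downscaling, so accuracy on such images may be noticeably lower.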
Sir plz make videos on feature selection and engineering
Love ur videos! , but how is this example multiclass, we are just using target and data. Thanks
Dear Sir
Very interesting exercise.
Model accuracy varies from 0.8 to 1.0, each and every time after a fresh run of the full code (as you explained). The average accuracy is around 0.966667.
Thank you very much
use (shuffle=False) in train_test_split()
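A caution on `shuffle=False` with the built-in iris data: the samples are stored in class order (50 of each species), so an unshuffled split puts only the last species in the test set. A fixed `random_state`, optionally with `stratify`, also gives a repeatable result while keeping all classes in both parts. The sketch assumes `load_iris` and a 20% test size.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)

# Unshuffled: the test set is simply the last 30 rows of the dataset
_, _, _, y_test_unshuffled = train_test_split(X, y, test_size=0.2, shuffle=False)

# Shuffled with a fixed seed and stratified by class: repeatable and balanced
_, _, _, y_test_stratified = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

print(np.unique(y_test_unshuffled))    # [2]
print(np.bincount(y_test_stratified))  # [10 10 10]
```

With the unshuffled split the model is tested on a single species, so the score says little about the other two.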
Hi, I had a query. In the part where you plotted the confusion matrix, shouldn't the xlabel be Truth and the ylabel be Predicted since in the confusion matrix we used y_test as x and y_predicted as y ?
Great videos btw, really helpful xD
good point
Yes I am also thinking the same.