Loan Prediction Analysis (Classification) | Machine Learning | Python

Hackers Realm

Views: 52,535

Published on Jan 10, 2025

Comments • 178

@HackersRealm 4 years ago ⁺⁶
Small correction, viewers: I described the left-skewed and right-skewed distributions the wrong way round. To avoid errors when converting to log values, add +1 to the column. I have updated the notebook on GitHub. Enjoy the rest of the video!!!
@vamsichittoor1974 3 years ago ⁺¹⁰
I have just started learning Machine Learning, and I understood every bit you explained and did a similar project on my own. Really great explanation. I would like to know how to master Machine Learning. I am not a CSE student; I am learning this out of my own interest.
@HackersRealm 3 years ago ⁺¹
Glad it was helpful!!! Kudos to you for learning out of your own interest. Try picking a mini project in some domain and solving it. That's a quick way to understand and learn...
@Umeshwarvaithy 4 years ago ⁺²
Super bro, when I started learning ML I found your channel, and I have been learning and making progress by doing the projects. Thanks for your interest and effort.
@HackersRealm 4 years ago
Hope the videos are useful to you!!! Thanks for watching, and please share for better reach. Thank you!!!
@NafisAnsari-vr2xq 2 months ago ⁺¹
This is awesome man, I just went through your blog, and the amount of effort put in is amazing. Thanks for the project explanation 🙌
@HackersRealm 2 months ago ⁺¹
Thanks for your kind words!!!
@chiragparmar3678 3 years ago ⁺¹
Bro, you explained it much, much better than edureka, I swear. Thanks!
@HackersRealm 3 years ago
Thanks for your kind words!!!
@gouthamkarakavalasa4267 3 years ago ⁺⁴
Bro, it looks like at 17:08 you applied the log transformation to coapplicant income but plotted the graph for applicant income. For coapplicant income, the log function throws an error because the column contains zeroes. Please advise on this issue.
@HackersRealm 3 years ago
You can add +1 to the data column; that will resolve the issue.
@mitali3j 3 years ago
At which step does the 1 need to be added?
@HackersRealm 3 years ago
@mitali3j You can add 1 when you see some 0 values, or you can do it in general; it won't change the log values much.
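A minimal sketch of the "+1" fix discussed above, assuming a toy DataFrame with made-up values in place of the real dataset. It uses NumPy's `np.log1p`, which computes log(1 + x), so zero incomes map to 0 instead of -inf.

```python
import numpy as np
import pandas as pd

# Toy data standing in for the CoapplicantIncome column (values are made up).
df = pd.DataFrame({"CoapplicantIncome": [0.0, 1508.0, 0.0, 2358.0, 4196.0]})

# np.log(0) evaluates to -inf, which is what breaks plots and models.
# np.log1p(x) computes log(1 + x), so zeros become exactly 0.0.
df["CoapplicantIncomeLog"] = np.log1p(df["CoapplicantIncome"])

print(df)
```

Rows that were zero all land on 0 after the transformation, while positive incomes land several units higher, so a density plot of the new column can show a separate peak at 0.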
@MrJeffoneal 2 years ago ⁺¹
Thank you! Very insightful and thorough explanations.
@HackersRealm 2 years ago
Glad you liked it 😀
@michaelk765 3 years ago ⁺¹
Great explanation of your model building. Thank you!
@HackersRealm 3 years ago
Glad you liked it!!!
@Kalyan1143 11 months ago ⁺¹
So what is the final output?
I mean, loan eligible: yes or no?
@HackersRealm 11 months ago
For the test data, we predict with the model and calculate a score for how well it predicts.
@anuragupadhayay8405 1 year ago ⁺²
"ValueError: could not convert string to float: 'Male'" keeps showing up when I use countplot.
@nitisht4040 1 year ago ⁺¹
Bro, did you find a solution? If yes, please help me out.
@MEGAMINDLIVE 4 years ago ⁺¹
At 14:38 you say the distribution is left-skewed, but it's right-skewed.
@HackersRealm 4 years ago
Sorry, I misstated the direction of the skew.
@anilsailakhinana94 3 years ago ⁺¹
I subscribed to your channel for this clear explanation 👍 It was so helpful.
@HackersRealm 3 years ago ⁺¹
Thanks for your kind words!!!
@avishkaravishkar1451 3 years ago ⁺¹
Excellent video, found it very helpful!
@HackersRealm 3 years ago
Glad it was helpful!!!
@mokshsharma6943 2 years ago ⁺¹
In the exploratory data analysis section of the video, how do I use a for loop with sns.countplot()?
@HackersRealm 2 years ago
You can store the column names in a variable and use subplots to show multiple plots.
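One way the loop-plus-subplots suggestion could look, as a sketch with a made-up three-column DataFrame. The keyword form `sns.countplot(x=..., data=...)` is the one recommended elsewhere in this thread for recent seaborn versions; the `Agg` backend line is only there so the sketch also runs outside a notebook.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; not needed inside a notebook
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Made-up rows standing in for the loan dataset's categorical columns.
df = pd.DataFrame({
    "Gender": ["Male", "Female", "Male", "Male"],
    "Married": ["Yes", "No", "Yes", "No"],
    "Education": ["Graduate", "Graduate", "Not Graduate", "Graduate"],
})

categorical_cols = ["Gender", "Married", "Education"]

# One row of subplots, one axis per categorical column.
fig, axes = plt.subplots(1, len(categorical_cols), figsize=(12, 3))

for ax, col in zip(axes, categorical_cols):
    # Pass the column by keyword and draw on the axis for this iteration.
    sns.countplot(x=col, data=df, ax=ax)
    ax.set_title(col)

fig.tight_layout()
```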
@sidharth_mohanty 4 years ago ⁺²
I'm unable to compute a correlation matrix on categorical data before label encoding.
How did you do that?
@HackersRealm 4 years ago ⁺¹
A correlation matrix can be calculated with numbers only, not with strings.
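A small sketch of restricting the correlation matrix to numeric columns, assuming a toy DataFrame with made-up values. `select_dtypes(include="number")` drops string columns such as Gender before `corr()` is called.

```python
import pandas as pd

# Made-up rows; Gender is a string column, the other two are numeric.
df = pd.DataFrame({
    "Gender": ["Male", "Female", "Male", "Female"],
    "ApplicantIncome": [5849, 4583, 3000, 2583],
    "LoanAmount": [128.0, 128.0, 66.0, 120.0],
})

# Keep only numeric columns, then compute pairwise correlations.
numeric_df = df.select_dtypes(include="number")
corr = numeric_df.corr()

print(corr)
```

To include a categorical column in the matrix, it has to be encoded as numbers first.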
@muhammedfaizals4427 2 years ago ⁺¹
for i in ['LoanAmount','Loan_Amount_Term','Credit_History']:
    tr_data[i] = tr_data[i].fillna(tr_data[i].mean())
We can use this instead of filling each column separately.
@HackersRealm 2 years ago
Yes, we could do that!!!
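The same loop idea can be extended to categorical columns, as in this sketch with made-up data. Numeric columns are filled with the mean, as in the comment above, and a categorical column is filled with its most frequent value, following the `mode()[0]` pattern quoted in another comment on this page. The column split is an assumption for illustration.

```python
import numpy as np
import pandas as pd

# Made-up rows with one missing value per column.
df = pd.DataFrame({
    "LoanAmount": [100.0, np.nan, 140.0],
    "Loan_Amount_Term": [360.0, 360.0, np.nan],
    "Gender": ["Male", None, "Male"],
})

# Numeric columns: fill missing values with the column mean.
for col in ["LoanAmount", "Loan_Amount_Term"]:
    df[col] = df[col].fillna(df[col].mean())

# Categorical columns: fill missing values with the most frequent value.
for col in ["Gender"]:
    df[col] = df[col].fillna(df[col].mode()[0])

print(df)
```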
@shellm1447 3 years ago ⁺¹
Amazing explanation
@HackersRealm 3 years ago
Glad it was helpful!!!
@brit_indi1930 2 years ago ⁺¹
YOU JUST EARNED THE SUB
@HackersRealm 2 years ago ⁺¹
Thanks man!!!
@pkmisra769 3 years ago ⁺¹
Very nice video. The best thing is your response to people's queries (unlike others). Great job. I have one suggestion: could you also cover how to deploy this model somewhere (with fresh data coming in and how the model produces output)? That would be amazing. Thanks.
@HackersRealm 3 years ago ⁺¹
Thank you very much. In this video, I have explained the process for deployment: th-cam.com/video/2LqrfEzuIMk/w-d-xo.html
@abhiavasthi624 3 years ago ⁺³
I have seen that you respond to comments, so I would just like to ask:
what changes do I have to make if my training and testing datasets are already in different files?
For example, in a Kaggle project where the training and testing data are in different files, what changes will I have to make in the code?
@HackersRealm 3 years ago
For training, don't split the data; train on the whole training set. After that, preprocess the test data the same way as the training data and predict on it. You can also watch the video in the playlist on how to predict test data.
@abhiavasthi624 3 years ago ⁺¹
@HackersRealm Thanks so much man, I respect your timely response.
What I did is skip the split part, simply preprocess the test data as well, and then use
y_test = model.predict(x_test)
for the prediction.
But in this case we can't check the score and all, right?
Since I didn't see the loan_status column in the test data.
@HackersRealm 3 years ago
@abhiavasthi624 Yes, that's right, you only get the output results.
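A rough sketch of the workflow described in this exchange, assuming two in-memory DataFrames with made-up rows in place of the separate train and test files. The `preprocess` helper and the choice of features are hypothetical; the point is that the same steps run on both frames and the model is fitted on all of the training rows.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Stand-ins for the two files (made-up rows); the test frame has no label.
train = pd.DataFrame({
    "ApplicantIncome": [5849, 4583, 3000, 2583, 6000, 1500],
    "Credit_History": [1.0, 1.0, 0.0, 1.0, 0.0, 1.0],
    "Loan_Status": ["Y", "Y", "N", "Y", "N", "Y"],
})
test = pd.DataFrame({
    "ApplicantIncome": [5720, 3076],
    "Credit_History": [1.0, 0.0],
})

def preprocess(frame):
    """Hypothetical helper: apply identical steps to train and test."""
    out = frame.copy()
    out["ApplicantIncomeLog"] = np.log1p(out["ApplicantIncome"])
    return out[["ApplicantIncomeLog", "Credit_History"]]

# No train/test split here: fit on every labelled row.
X_train = preprocess(train)
y_train = train["Loan_Status"]
X_test = preprocess(test)

model = LogisticRegression()
model.fit(X_train, y_train)

# Without Loan_Status in the test data, only predictions are available.
predictions = model.predict(X_test)
print(predictions)
```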
@vedgadge8659 2 years ago
At 40:36, Dependents is already in numeric form, so why does it require label encoding?
@HackersRealm 2 years ago ⁺¹
Yes, we don't need to include that.
@vedgadge8659 2 years ago
@HackersRealm Hey man, I tried that, but if we don't include Dependents it gives an error while classifying. It is the same error as in the video:
ValueError: could not convert string to float: '3+'. I don't understand this.
@HackersRealm 2 years ago
@vedgadge8659 Oh yeah, I forgot that. It is stored as a string, which is why I used the label encoder. But you can remove the + and convert the string to an integer.
@vedgadge8659 2 years ago ⁺¹
@HackersRealm Okay sure, I'll try. Thanks man.
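A short sketch of the "remove the + and convert to integer" suggestion, assuming a toy Dependents column with no missing values. Missing values would need to be filled first, since `astype(int)` fails on NaN.

```python
import pandas as pd

# Toy column: values are strings, and the top category is written as '3+'.
df = pd.DataFrame({"Dependents": ["0", "1", "2", "3+"]})

# Strip the literal '+' (regex=False treats it as plain text), then cast.
df["Dependents"] = (
    df["Dependents"].str.replace("+", "", regex=False).astype(int)
)

print(df["Dependents"].tolist())
```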
@rakeshnargund570 2 years ago
Hi, well explained. I have one question: why did you not drop "ApplicantIncome" even though you combined it with "CoapplicantIncome" to create "Totalincome"?
@akshaykrishnan7985 3 years ago ⁺¹
Hi Ashwin. Could you please upload videos on model deployment with Flask using Heroku?
@HackersRealm 3 years ago ⁺¹
Hello, I will cover model deployment in later videos for sure. For now I am just covering the basic concepts for better understanding!!!
@akshaykrishnan7985 3 years ago ⁺¹
Thanks a lot 😊
@SanyAnnieJohn 4 years ago ⁺¹
Hi Sir, logistic regression gave the best score, so why choose random forest for hyperparameter tuning?
@HackersRealm 4 years ago ⁺²
For example purposes only.
@dr.mahaboobbasha1074 1 year ago
Sir, we normalised the income data of applicants and coapplicants. Where does that impact the analysis?
@HackersRealm 1 year ago
It will impact the model training and testing... but that comparison is not covered in the video.
@kartiksolanki9390 2 years ago ⁺²
Very helpful
@HackersRealm 2 years ago
Glad it was helpful to you!!!
@sodiqrafiu9072 4 years ago ⁺²
Please come up with more projects.
@HackersRealm 4 years ago ⁺¹
Working on it.
@funnybunnies3985 3 years ago ⁺¹
Why are you using log transformation? Couldn't you normalise the data instead?
@HackersRealm 3 years ago
You can use any preprocessing approach. It's no issue; try it and see how it works.
@afreen2806 2 years ago
Except for logistic regression, the accuracy and cross-validation scores of all the other models change if I run them more than once. Can you explain why?
@HackersRealm 2 years ago
You can set random_state in order to get the same results when rerunning.
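A sketch of the `random_state` suggestion on synthetic data (generated with `make_classification`, not the loan dataset). Random forests draw random samples and feature subsets while training, so fixing the seed makes repeated runs return the same score.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Synthetic data standing in for the preprocessed loan features.
X, y = make_classification(n_samples=200, n_features=8, random_state=0)

def mean_cv_score(seed):
    """Train a forest with a fixed seed and return the mean 5-fold accuracy."""
    model = RandomForestClassifier(n_estimators=50, random_state=seed)
    return cross_val_score(model, X, y, cv=5).mean()

# Same seed, same data, same folds: the two runs agree.
first_run = mean_cv_score(42)
second_run = mean_cv_score(42)
print(first_run, second_run)
```

If the data is split with `train_test_split`, that call takes a `random_state` argument as well.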
@rameshkannan1075 3 years ago
Can you explain the credit history values 0 and 1 in the data? Can you post a video or tutorial link on how CIBIL data is analysed to get credit history values?
@HackersRealm 3 years ago
If the person has a credit history, it's 1, or else it's 0. I will try analysing CIBIL data if possible.
@rameshkannan1075 3 years ago
@HackersRealm What I need to know: there will be n customers. How do I extract these customers' CIBIL data into a single Excel file? Then, based on past repayment, we can decide the probability of default.
@shellm1447 3 years ago
Have you also covered the hmeq dataset for loan default prediction?
@HackersRealm 3 years ago
No, not yet!!!
@nathanthadmalla9268 3 years ago
Where can we get the main dataset? The link leads only to the train and test datasets. Where can we get the first dataset that you loaded in your video?
@HackersRealm 3 years ago
That is the train data. You can use that.
@diff008 2 years ago
While plotting countplot I keep getting "ValueError: could not convert string to float". Any idea why? The dataset was downloaded from your Kaggle link. No changes (although it looks like the file names have now changed).
@HackersRealm 2 years ago
Try checking the values you're plotting; that may be the issue.
@SanyAnnieJohn 4 years ago ⁺¹
Hi Sir, when I plot Gender, why does my x axis not show the labels Male and Female? Instead it displays 0 and 1.
@HackersRealm 4 years ago
If you have done some transformation on that column, it will show like that.
@SanyAnnieJohn 4 years ago ⁺¹
@HackersRealm Thank you, got it....
@sakshituteja3841 4 years ago
This video is a great explanation of this project. I have just one doubt. Where I took the dataset from, the test data is in a separate file of around 350 observations. How do I make use of that?
@HackersRealm 4 years ago ⁺¹
Glad you liked this video!!! You can use the test data to predict the output and submit it if there is a competition. For practice, there won't be much use for it.
@PravinKumar-zc2eq 4 years ago
@HackersRealm How do I do that??
@snehacookie4138 3 years ago ⁺¹
Can this project be done as a final year project? Is it a good topic?
@HackersRealm 3 years ago
Yeah, many people have done this as a final year project.
@snehacookie4138 3 years ago
@HackersRealm Thank you.
We can present it just like this, right?
@HackersRealm 3 years ago ⁺¹
@snehacookie4138 Yes
@snehacookie4138 3 years ago
@HackersRealm Bro, is this project good for jobs? If I put it on my resume, will it help me get selected by a company? Please tell me, bro.
@HackersRealm 3 years ago
@snehacookie4138 Well, that completely depends on the recruiter, but students have said they used it on their resumes.
@DhirajKrGupta-ke7xn 3 years ago
• What tech skills did you learn from the project?
• Why did you pick that domain?
• Where can we use the tech skills / software learnt during the project?
• Reason for working on that project
Sir, please help me with interview preparation.
@saisudhir5005 4 years ago ⁺²
How to increase accuracy?
@HackersRealm 4 years ago
By using different models, hyperparameter tuning, etc. Watch my other projects to learn more techniques.
@rodsdesignestudio 1 year ago
Hi, thanks for the vids, but I want to ask: why did you apply LabelEncoder to the input values (['Gender',"Married","Education",'Self_Employed',"Property_Area","Loan_Status","Dependents"])? Thanks
@HackersRealm 1 year ago
We have to convert strings to numeric values so the model can accept the input. Label encoding is one such technique.
@afserali450 1 year ago
@HackersRealm How do I convert 'Male' in the Gender column to float?
@HackersRealm 1 year ago
@afserali450 In the video, I used a label encoder or one-hot encoder to do that. You can use whichever method is feasible.
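A sketch of the two encodings mentioned in this exchange, on a made-up DataFrame. `LabelEncoder` is scikit-learn's class; for the one-hot variant this sketch swaps in `pd.get_dummies` as a convenient stand-in, which is an assumption and not necessarily what the video uses.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Made-up rows with two string columns.
df = pd.DataFrame({
    "Gender": ["Male", "Female", "Male"],
    "Property_Area": ["Urban", "Rural", "Semiurban"],
})

# Label encoding: each distinct string becomes an integer code,
# assigned in sorted order of the class names.
le = LabelEncoder()
df["Gender_encoded"] = le.fit_transform(df["Gender"])

# One-hot encoding: one indicator column per category.
one_hot = pd.get_dummies(df["Property_Area"], prefix="Property_Area")

print(df)
print(one_hot)
```

Label encoding gives each category an arbitrary integer, whereas one-hot encoding avoids implying an order between categories at the cost of extra columns.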
@LoneWolfff07 4 years ago ⁺¹
Bro, how can I get accuracy above 80.42?
Which algorithm should I use?
@HackersRealm 4 years ago ⁺¹
It depends on every factor, not only the algorithm. Check out the other projects in the tutorial series to get additional insights on increasing accuracy.
@snehamagadum1342 3 years ago
Sir, I did not get the conclusion of this project. After the heat map, how can we tell whether the loan is approved or not?
@HackersRealm 3 years ago
Are you asking about the model training and results section?
@SovannLy-h3s 1 year ago
Hello Sir,
I followed your code up to the 'Exploratory Data' section. I replaced the missing values: df['Gender']=df['Gender'].fillna(df['Gender'].mode()[0])
The line of code below:
sns.countplot(df['Gender'])
The result:
ValueError: could not convert string to float: 'Male'
Could you please advise me how to correct the code?
Thank you
@HackersRealm 1 year ago
Try this: sns.countplot(x='Gender', data=df)... It's due to an update in the seaborn package.
@dr.mahaboobbasha1074 1 year ago
Sir, would it be possible to get the Python code for this and other videos?
@HackersRealm 1 year ago
It's available in the GitHub repo; the link is in the description.
@snrmedia8965 3 years ago
Why did you fill loan amount directly with the mean instead of checking for outliers?
@HackersRealm 3 years ago
To handle outliers, I used log transformation.
@oushnik 3 years ago
Another question... why is feature scaling not working here?
@HackersRealm 3 years ago
We can use feature scaling too. There are various preprocessing methods to use and get insights.
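A minimal sketch of feature scaling with scikit-learn's `StandardScaler` on made-up values. One possible cause of a TypeError that mentions 'StandardScaler' (an assumption, since the failing code is not shown on this page) is assigning or passing the scaler object itself where data is expected, instead of the array returned by `fit_transform`.

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Made-up numeric columns.
df = pd.DataFrame({
    "ApplicantIncome": [5849.0, 4583.0, 3000.0, 2583.0],
    "LoanAmount": [128.0, 128.0, 66.0, 120.0],
})

cols = ["ApplicantIncome", "LoanAmount"]

# Create the scaler, then assign the transformed values (an array),
# not the scaler object, back to the columns.
scaler = StandardScaler()
df[cols] = scaler.fit_transform(df[cols])

print(df)
```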
@oushnik 3 years ago
Can I segregate and train the model instead of using the log function? Or is it necessary to use the log function throughout this project? And one more confusion, as I'm new: what is the goal of this whole project? I know it sounds silly, but please explain it to me.
@HackersRealm 3 years ago
We are trying to predict whether a person can get a loan from the bank or not. And log transformation is not compulsory; you can use other methods.
@oushnik 3 years ago
@HackersRealm Hmm, so I used the same as before, then it's OK... Another thing: why is feature scaling not working here???
I'm getting an error like this:
"TypeError: float() argument must be a string or a number, not 'StandardScaler'"
@rahulgaddam7110 4 years ago ⁺¹
How do I remove -inf in total income and coapplicant income? I tried but couldn't resolve it. Please help.
@HackersRealm 4 years ago
If you are using log transformation, try it like this: np.log(1+df['name']). It will solve the problem.
@akhilkrishna8521 4 years ago
np.seterr(divide = 'ignore')
train['CoapplicantIncomeLog'] = np.where(train['CoapplicantIncome']>0, np.log(train['CoapplicantIncome']), 0)
This will solve your problem.
@mitali3j 3 years ago
But after adding 1, I can see 2 bell curves in the generated graph....
What does that mean?
@iamrahul2944 4 years ago
Sir, I am not able to add a new column; I am getting an error.
My code: data['total_income']=data['ApplicantIncome']+['CoapplicantIncome']
@HackersRealm 4 years ago
It should be data['CoapplicantIncome']; please check the syntax.
@PravinKumar-zc2eq 4 years ago
Can you tell me how to train a LogisticRegression model??🙏
@HackersRealm 4 years ago
I think I have explained how to train logistic regression too. Could you please check the video again?
@PravinKumar-zc2eq 4 years ago
@HackersRealm Sorry, I meant to ask how to tune the LogisticRegression model.
@HackersRealm 4 years ago
OK, I didn't cover hyperparameter tuning; it would take a complete video. I will try to post videos on that in the future.
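As a starting point for the tuning question, here is a sketch that searches over the regularization strength `C` of `LogisticRegression` with scikit-learn's `GridSearchCV`. The data is synthetic and the grid values are arbitrary assumptions, not settings from the video.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

# Synthetic data standing in for the preprocessed loan features.
X, y = make_classification(n_samples=200, n_features=8, random_state=0)

# Candidate values for the inverse regularization strength C (arbitrary).
param_grid = {"C": [0.01, 0.1, 1, 10]}

# Evaluate each candidate with 5-fold cross-validation on accuracy.
search = GridSearchCV(
    LogisticRegression(max_iter=1000),
    param_grid,
    cv=5,
    scoring="accuracy",
)
search.fit(X, y)

print(search.best_params_, search.best_score_)
```

After fitting, `search.best_estimator_` holds a model refitted on all the data with the best value found.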
@kumarsanjibray9415 3 years ago
sns.distplot runs but does not show the graph properly. Could you tell me what to do??
@HackersRealm 3 years ago
Try specifying the x, y values properly.
@kumarsanjibray9415 3 years ago
@HackersRealm How do I specify them?? Tell me if you can.
@HackersRealm 3 years ago
@kumarsanjibray9415 Try this documentation: seaborn.pydata.org/generated/seaborn.distplot.html
@niklausmikealson3115 4 years ago ⁺¹
I didn't understand where it's shown how many people are approved for a loan already.
@HackersRealm 4 years ago
It is clearly given in the dataset itself; please use the head function to see the labels.
@niklausmikealson3115 4 years ago
What is the goal? Please tell me.
@HackersRealm 4 years ago
@niklausmikealson3115 Based on the attributes of the person, we need to find whether they are eligible for a loan.
@VickyKumar-sg3jc 3 years ago ⁺¹
So helpful
@HackersRealm 3 years ago ⁺¹
Glad you liked it!!!
@VickyKumar-sg3jc 3 years ago
@HackersRealm Thank you sir for responding.
I am getting an error in preprocessing with the label encoder:
TypeError: not supported between instances of str and float
@HackersRealm 3 years ago
@VickyKumar-sg3jc I think one column has both float and string values. Please check the type of the data.
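One way a column ends up mixing strings and floats is an unfilled missing value, because NaN is a float. The sketch below, on a made-up column, inspects the Python types present and fills the gap before encoding. This is one possible cause of the error, not a confirmed diagnosis of the commenter's data.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Made-up column: three strings and one missing value (NaN is a float).
df = pd.DataFrame({"Self_Employed": ["No", np.nan, "Yes", "No"]})

# Inspect which Python types appear in the column.
print(df["Self_Employed"].map(type).value_counts())

# Fill the gap with the most frequent value, then encode.
df["Self_Employed"] = df["Self_Employed"].fillna(df["Self_Employed"].mode()[0])
df["Self_Employed"] = LabelEncoder().fit_transform(df["Self_Employed"])

print(df["Self_Employed"].tolist())
```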
@ranjangowda9878 3 years ago
Hello,
Can I use this project as my mini project??
@HackersRealm 3 years ago
Yes, you can.
@lalithkishorep2618 9 months ago
How can you say that it was imputed with the mean??
@HackersRealm 9 months ago
Which part are you referring to?
@mohmmedshahrukh8450 4 years ago ⁺¹
But your result doesn't show anywhere who is eligible or not.
@HackersRealm 4 years ago
If you check the y label, it will be there.
@niklausmikealson3115 4 years ago
How do I check the y label?
@nandinijain4461 3 years ago
Where can we download the dataset from? Can you provide a link or the dataset in zip format?
@HackersRealm 3 years ago
Links are in the description.
@HossainRabin 4 years ago ⁺¹
Excellent tutorial, but you mixed up left-skewed and right-skewed data. Appreciate your effort.
@HackersRealm 4 years ago
Yes, you are right. I will correct it next time. Thanks for watching the video.
@varunrokade1617 4 years ago
Can someone tell me what currency the applicant income and the other amounts in this dataset are in?
@PravinKumar-zc2eq 4 years ago ⁺¹
It's in dollars.
@varunrokade1617 4 years ago
@PravinKumar-zc2eq Thank you!!
@be_it_b_76_saurabhyadav36 3 years ago
@PravinKumar-zc2eq Is it in dollars after log transformation? Before log transformation, for example, the applicant income in the 1st row was 5489, and then it became 8.67. What if I want the income as it was in the original dataset? I'm guessing it was in rupees before the log. Kindly help if you know, Praveen.
@HackersRealm 3 years ago
No, it's in dollars all the time. I did some data preprocessing on it, which is why the values are small afterwards. That will be helpful in getting good results.
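The log transformation changes the scale of the numbers, not the currency, and it can be undone. A small sketch with toy values: `np.exp` inverts `np.log`, and `np.expm1` inverts `np.log1p` (the "+1" variant mentioned elsewhere on this page).

```python
import numpy as np

# Toy income values.
income = np.array([5849.0, 4583.0, 3000.0])

# Forward transform and its inverse.
income_log = np.log(income)
recovered = np.exp(income_log)

# If the "+1" variant was used, invert it with expm1 instead.
income_log1p = np.log1p(income)
recovered_from_log1p = np.expm1(income_log1p)

print(income_log)
print(recovered)
```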
@zainabkhalil268 2 years ago
Is there any way of connecting with you via email etc.?
@HackersRealm 2 years ago
You can reach me via LinkedIn or Instagram; links are in the description.
@quincykao749 2 years ago
Would it be possible for you to add subtitles?
@HackersRealm 2 years ago
They may be generated automatically by YouTube.
@quincykao749 2 years ago
@HackersRealm They are not available for some reason.
@naveengodara6777 3 years ago
Hi... I need some help with a loan prediction workshop... could you please help?
@HackersRealm 3 years ago
Please reach me via Insta or LinkedIn.
@naveengodara6777 3 years ago
@HackersRealm Texted you on Instagram... please have a look.
@akhilkrishna8521 4 years ago
At line number 23 you haven't done sns.distplot for coapplicant income, so is that a mistake??
@HackersRealm 4 years ago
I did do it for coapplicant income; check the 16th minute of the video. But I mistakenly plotted applicant income, sorry for that.
@binduskumar3201 4 years ago
Hi,
You are doing a good job.... Thanks for the video....
There is a mistake in plotting the distplot of 'CoapplicantIncome':
instead of 'CoapplicantIncome' you have chosen 'ApplicantIncome'....
@binduskumar3201 4 years ago ⁺¹
And one more thing: we cannot apply the log function to 'CoapplicantIncome' since it contains zero values....
@HackersRealm 4 years ago ⁺²
If you are using log transformation, try it like this: np.log(1+df['name']). It will solve the problem.
@HackersRealm 4 years ago
Yes, my mistake. Sorry for the error.
@gautamranafounderofbexpert6539 4 years ago ⁺¹
Thank you, I found it helpful too.
@HackersRealm 4 years ago
You're welcome!!!
@mdtahsinkhan242 3 years ago
Sir, please provide the dataset.
@HackersRealm 3 years ago
Check the GitHub link.
@siddharthlasiyal4037 4 years ago ⁺³
Thank you, boss
@tusharpandey3566 4 years ago ⁺²
Very helpful
@HackersRealm 4 years ago ⁺¹
Glad it was helpful!!!
@lumdevsawarkar4497 1 year ago
Outlier detection
@HackersRealm 1 year ago
There is a separate video in the ML concepts playlist. You can check that out!!!
@Mandarpatil091 2 years ago
Check cell no. 23
