I think a decision tree may not always handle imbalanced data well, because each node is split using a metric (Gini / information gain), and the goal of a split is to make the class distributions in the child nodes as different as possible. For example: we have 100 data points, of which 5 belong to the minority class. Suppose a node splits them 50:50 and one side contains only majority-class points. The info gain will be high, but we still will not be able to recognize those 5 minority samples. Yes, no doubt those 5 samples will be classified at some point, but only in a deep tree with lots of levels, which means we may overfit. So we can use class weights / upsampling / smite in this case.
*SMOTE. You can also use stratified cross-validation with a random forest for imbalanced data.
You have good knowledge 😊👍
@MsRAJDIP Yeah, stratified k-fold.
How can there be a 50-50 split when there are 95 majority samples and only 5 minority samples?
Exactly @Madhavan, perfect example. I am not convinced that a decision tree can work fine on imbalanced data.
It would be really helpful if @Krish could elaborate on this.
The decision tree algorithm is effective for balanced classification, although it does not perform well on imbalanced datasets.
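The 100-point example in this thread can be checked with a few lines of Python. This is only a rough sketch: the 95/5 class counts come from the comment above, but the exact contents of the two child nodes (50 majority on one side, 45 majority plus 5 minority on the other) and the "isolating" split are assumptions made for illustration, and the function names are hypothetical.

```python
from math import log2


def entropy(counts):
    """Shannon entropy (in bits) of a node, given per-class sample counts."""
    total = sum(counts)
    return -sum((c / total) * log2(c / total) for c in counts if c > 0)


def information_gain(parent, children):
    """Parent entropy minus the size-weighted entropy of the child nodes."""
    total = sum(parent)
    weighted = sum(sum(child) / total * entropy(child) for child in children)
    return entropy(parent) - weighted


# Counts are [majority, minority]; 95/5 is taken from the example in the thread.
parent = [95, 5]
# Assumed 50:50 split: one child is pure majority, the other holds all 5 minority points.
even_split = [[50, 0], [45, 5]]
# Hypothetical alternative: a split that separates the minority class completely.
isolating_split = [[95, 0], [0, 5]]

gain_even = information_gain(parent, even_split)
gain_isolating = information_gain(parent, isolating_split)

print(f"parent entropy:        {entropy(parent):.3f} bits")
print(f"gain, 50:50 split:     {gain_even:.3f} bits")
print(f"gain, isolating split: {gain_isolating:.3f} bits")
```

With these assumed counts the 50:50 split gains roughly 0.05 bits against a parent entropy of roughly 0.29 bits, while the isolating split recovers all of it. That is worth keeping in mind when judging how "high" the gain in the example really is.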
This is the first time I am seeing Krish sir wearing the interviewer hat. After watching, I guess you enjoyed this process a lot. I have attended just a couple of interviews in the past, but you both are so chill. I hope one day I will get the chance to sit in front of you.
These interviews boosted my confidence that I can become a data scientist.
After seeing all the interviews, I am confident that I am very good at data science concepts.
And also for the outliers: if you have two or more hidden layers, the network will be resilient to outliers and also to input noise. You can very well check this by implementing it.
I can answer almost all the questions asked by Sudhanshu sir.
29:58 A decision tree will be sensitive to an imbalanced dataset because the entropy calculation is affected when the data is imbalanced.
Correct me if I am wrong.
Yes, I was thinking the same. A decision tree should be sensitive to an imbalanced dataset :\
Nice one. One suggestion: can you add subtitles saying "correct answer" when the student answers correctly and "false" when the student gives a wrong answer? It would be really helpful.
Pretty helpful. Thanks for your efforts.
But one suggestion here.
In real interviews, once the candidate talks about his project, the interviewers go in depth on that particular topic, and then the interview opens up.
For example, when Krish sir asked about ticket priority levels like P1, P2, etc., the candidate said he didn't consider them, which is a little surprising considering every ticket system has an escalation level. So there he could have gone deeper and asked what the important features were and how the candidate determined them.
Similarly, when Sudhanshu sir asked which algorithm was used and the candidate said random forest regression, he should have asked why this technique was chosen, which preprocessing techniques were used, and so on.
Everything else was very informative; the only thing is that in real interviews they go in depth on our projects first, and then it opens up.
Keep up the good work, and thanks for such fruitful sessions 👏
Bro 😅 you don't know what goes on in a candidate's mind, or whether the guy is prepared or not. Saying these things is easy while watching, but facing the same thing in a live interview in front of 700 people is not an easy task. Yeah, I agree that what you said is a good approach, but it only works when the mindset is set for that context. Here the candidate is probably confused about whether it's an open discussion or an interview, so yeah, it's tough to do this live.
I am amazed by the initiative in sharing information and knowledge about data science. Thank you very much.
These interviews are making me overconfident! Lmao!!
I'm still in second year! Looking forward to working even harder to get a job at a good product-based company! These videos are really helpful!
Haha, overconfident 😂
Why product-based, bro? Why not service-based? Please tell me.
I would have to disagree with the part where you mentioned that decision trees are not sensitive to imbalanced classes. They can handle imbalanced classes reasonably well, but they are still influenced by the imbalance (information gain and the Gini measure are skew sensitive). You have to use a weighted decision tree or GBDT to tackle this issue.
No, they're not. There are no formula coefficients being trained; it's just a kind of if-else statement for every condition, and the imbalanced dataset just gets passed through these nodes.
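On the "weighted decision tree" idea in this thread, here is a minimal sketch of how class weights change the Gini impurity a tree sees at a node. The 95/5 counts are borrowed from the example elsewhere in the comments, the function and variable names are hypothetical, and the "balanced" weights use the common heuristic n_samples / (n_classes * class_count); this shows only the impurity calculation, not a full tree.

```python
def gini(counts, class_weights=None):
    """Gini impurity of a node from per-class counts, optionally re-weighted per class."""
    if class_weights is None:
        class_weights = [1.0] * len(counts)
    weighted_counts = [c * w for c, w in zip(counts, class_weights)]
    total = sum(weighted_counts)
    if total == 0:
        return 0.0
    return 1.0 - sum((x / total) ** 2 for x in weighted_counts)


# Counts are [majority, minority]; assumed for illustration.
counts = [95, 5]
n_samples = sum(counts)

# "Balanced" heuristic: each class gets weight n_samples / (n_classes * class_count).
balanced_weights = [n_samples / (len(counts) * c) for c in counts]

plain_impurity = gini(counts)
weighted_impurity = gini(counts, balanced_weights)

print(f"unweighted Gini: {plain_impurity:.3f}")
print(f"weighted Gini:   {weighted_impurity:.3f}")
```

Without weights the node already looks almost pure, so the tree has little incentive to split on the minority class. With balanced weights the same node looks maximally impure for two classes, which is one way to read the "skew sensitive" remark above.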
22:35 😝😝😝 Sudhanshu sir to the point.
At 26:13 the best answer would be: if you are using the milk for tea, then water; otherwise milk :P
Sir, please make videos on the mathematics behind SVM regression, AdaBoost regression, and gradient boosting classification.
The interview was good anyway, but the last 10 minutes were fantastic 😀
Thank you so much, sir. Your effort in making the step-by-step guide to start data science is really apparent. I have a doubt: when should I start with Kaggle (at which point during this excellent curriculum should I start Kaggle)? Please reply, sir. I'm expecting a reply from you.
Very helpful. Thanks, Krish.
Thanks a lot for making these types of interviews.
"We are only looking for a binary answer" (punchline by Sudhanshu sir). Loved the way the interview was conducted.
loved it!
Hi Krish,
I have seen 2-3 of your videos, and they motivated me to learn DS and ML.
Currently I have a C++ profile with 9 years of experience, and I am interested in data science and ML. Is there any scope for me to move into a DS and ML profile? How can I achieve this in 3-4 months?
Yes, you can definitely do it with proper preparation.
@krishnaik06 Could you guide me on this? How should I start?
Why do people jump directly to deep learning? In the last fresher interview, the guy hadn't worked on machine learning algorithms, and now this guy has only done the computer vision part of deep learning. Why is this happening? Why isn't everyone covering each and every part of machine learning and deep learning?
Just wanted to know why people do this.
I don't think it's a good practice.
It depends on each person's perspective. It's not like you must learn machine learning and then go to deep learning, as deep learning is independent of the shallow machine learning algorithms. And deep learning is much more interesting than machine learning. The main thing is that Krish sir leans more towards machine learning, so he always asks those types of questions, and the candidate leans more towards deep learning. So from what you see, you will say these guys don't know ML. If Krish sir had asked DL questions from the start, you would never have formed such an opinion. What you see is not always true, I would say.
Probably in companies they ask both ML and DL, and that is what I have done. Candidates should focus on both; otherwise they will not be able to decide whether to go with ML or DL. And I am not always towards ML :)
@krishnaik06 :-) You are for making a choice instead of going only one way. Great work 👍
One should learn everything but can focus on the one side that interests them. It's a very broad field, and you cannot manage to know every detail in every domain. That comes over the years, after working on many tasks.
@krishnaik06 Thank you so much, sir. Your effort in making the step-by-step guide to start data science is really apparent. I have a doubt: when should I start with Kaggle (at which point during this excellent curriculum should I start Kaggle)? Please reply, sir. I'm expecting a reply from you.
Sir, I am studying actuarial science. Although it's amazing, it's mostly applied.
I am particularly interested in statistics in its pure form because, whenever I apply it, I would like to know that I understand its reasoning from the core. Please make a video recommending a step-by-step guide, through books, on how to study statistics in its pure form.
Please take interviews for freshers also.
yes
Link to Sudhanshu sir's channel
AIC SC when comparing between different models lm
That is the wrong explanation of why ResNets have skip connections.
Sir, please don't use "probably"...!
😂 probability is everywhere
Not sure why there are dislikes; I'm interested to understand the reason from those people.
This guy surely needs to brush up on his basics.
He doesn't know anything about this at all.
These interviews made me think that I will be a good data scientist one day! I can do a lot better than that!
The first thing I observe is that their English is really bad