Live Virtual Mock Interview Of Statistician IIT Kanpur For Data Science

  • Published on 10 Dec 2024

Comments • 378

  • @krishnaik06
    @krishnaik06 4 years ago +135

    "Hello Sir, thanks for that interview session; it helped me in several ways. Today I got a job offer as Associate Software Engineer, AI."
    This was the message from Sahil within a week of this mock interview. Congratulations 👍👍

    • @carlmarx6436
      @carlmarx6436 3 years ago +3

      howwww!!!!

    • @ameerazam3269
      @ameerazam3269 3 years ago +1

      Logistic for multiclass lol ...

    • @nik7867
      @nik7867 2 years ago

      @@carlmarx6436 tag bro 😂

    • @harshmehta6321
      @harshmehta6321 2 years ago

      Hahahaha

    • @inFamous16
      @inFamous16 2 years ago

      @@carlmarx6436 He was nervous throughout the interview. Also there is always a luck factor that plays an important role while interviewing in different companies.

  • @dheerajsharma5492
    @dheerajsharma5492 4 years ago +152

    Now this is what a real interview looks like, and this is how an interviewee normally behaves during one. Thank you for sharing this with us. 😊

  • @tawhidibnwahid4825
    @tawhidibnwahid4825 4 years ago +72

    I'm also a so-called statistics student.
    After seeing this interview, I really think I should study.

  • @bheeshmak.s5125
    @bheeshmak.s5125 3 years ago +41

    I just realized my interviewer conducted the entire interview today by picking questions from this mock interview. I should have watched it earlier to give better answers. What a coincidence 🤣

  • @justanotherpoet125
    @justanotherpoet125 4 years ago +48

    Please stop posting negative comments about the candidate. At least he was gracious enough to allow this interview to be posted. This is how people learn. Not everybody knows everything, even if they are studying or have studied at IIT.

  • @antonyamalrajmorais160
    @antonyamalrajmorais160 4 years ago +93

    From my experience, this is the best replica of real-time data science interviews in companies. They don't go to ML. They'll finish you in statistics itself.

    • @snehangshubhattacharjee8072
      @snehangshubhattacharjee8072 4 years ago +15

      If you think this is the best replica of real time data science interview, I am very sure about the kind of interviews you have attended!
      Get a life!

    • @snehangshubhattacharjee8072
      @snehangshubhattacharjee8072 4 years ago +1

      @0_0 There are various other comments which would answer your query.

    • @anmol3457
      @anmol3457 4 years ago +11

      @@snehangshubhattacharjee8072 ah here we have a typical person on the internet who thinks he knows the other person behind a screen. Get a life instead of trying to be a smartass on the internet ;)

    • @anmol3457
      @anmol3457 4 years ago +1

      @0_0 perhaps he's frustrated with his life. But I'm not him so I'm not gonna assume anything.

    • @snehangshubhattacharjee8072
      @snehangshubhattacharjee8072 4 years ago +2

      @@anmol3457 @Anmol I hope you realise that an interview in which a young boy is bullied by two people who have a little more experience than him, with everyone else celebrating it by calling it the 'best replica' and people like you coming out in support of that, is a beautiful example of 'human stupidity'.
      However, I would really like to thank you for putting forward such beautiful examples.

  • @ashishmishra-sz9sm
    @ashishmishra-sz9sm 4 years ago +59

    It's not about college; knowledge is a practice. The more you practice and work in live scenarios, the more you learn.

  • @sagargupta8084
    @sagargupta8084 4 years ago +35

    I can't believe questions would be this straightforward in an interview setting!

  • @sairamgajavelli5125
    @sairamgajavelli5125 4 years ago +54

    Questions
    1. When we work with a sample, we generally use n-1 in the denominator. What is the reason behind that?
    1.1 Have you heard of Bessel's correction?
    2. Can you give me a scenario where I can apply the CLT effectively and get results out of it?
    2.1 What are the outcomes of statistical tests?
    2.2 What are the outcomes of a normal distribution? What assumptions do you make from a normal distribution?
    3. If data is given to you for statistical analysis, how will you find the outliers, and what approach will you take to make it a normally distributed data set?
    3.1 What is your approach to finding the outliers, and your approach to handling them?
    3.2 Without discarding them, how will you handle the data?
    3.3 What do you understand by normalisation?
    3.4 What is the difference between normalisation and standardisation?
    4. What is the difference between the Z statistic and the T statistic? When are you supposed to use each one, and when not?
    4.1 Can you talk about the difference between Z and T in terms of standard deviation?
    Machine learning concepts:
    5. Why is root mean square error sometimes called one of the worst error calculations for regression?
    5.1 What is the probable solution when one feature is on one scale and another feature is on a different scale?
    6. Can you talk about the error function or loss function in logistic regression?
    7. What's your favourite algorithm?
    7.1 If you had to teach me linear or logistic regression, how would you explain it?
    8. You've worked on time series. How would you do the train-test split?
    8.1 Why sequentially arranged? (Time series)
    8.2 Will there be much impact if the data has a lot of variation?
    9. How do you decide the number of clusters? (k-means)
    9.1 If you undersample the data, what is the result? (ans: loss of information)
    9.2 What does oversampling do?
    9.3 Have you tried any class weights?
    9.4 Which algorithms have you applied for supervised learning?
    9.5 Are there any algorithms that are not impacted by imbalanced datasets?
    10. Problem statement: you are a placement coordinator facing a challenge. Many companies visit your college, and every company has the same set of requirements. As the placement coordinator you are stuck between the companies and the students: there are thousands of resumes, and you have to map the correct resume to the correct company.
    You have to solve it using ML, not NLP.
    10.1 If you take a decision tree (DT), how are you going to define the classes?
    11. You have done one-hot encoding to change categories into numerical values. What if you have a category like pin code?
    11.1 Why do we convert categories to numerical values?
    12. You've built a NN. How do you decide how many layers and hidden neurons to use? (ans: Keras Tuner)
    12.1 Why is k-fold cross-validation used?
    12.2 What is CV?

  • @hstation6486
    @hstation6486 4 years ago +19

    51:00 We can use clustering: cluster the resumes by skill set and then sort them.

  • @mizgaanmasani8456
    @mizgaanmasani8456 4 years ago +17

    We use the t-test when the population standard deviation is not known; it is used for comparing two population means, paired or independent. The Z-test is the simpler case, where the standard deviation is known!
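A minimal sketch of the distinction drawn in this comment, using only the Python standard library. The sample values, the hypothesised mean `mu0` and the "known" `sigma` are made-up illustration values, not data from the interview.

```python
import math
import statistics


def z_statistic(sample, mu0, sigma):
    """One-sample z statistic: the population standard deviation sigma is known."""
    n = len(sample)
    return (statistics.fmean(sample) - mu0) / (sigma / math.sqrt(n))


def t_statistic(sample, mu0):
    """One-sample t statistic: sigma is unknown, so the sample standard
    deviation (n - 1 denominator) is used, giving n - 1 degrees of freedom."""
    n = len(sample)
    s = statistics.stdev(sample)
    return (statistics.fmean(sample) - mu0) / (s / math.sqrt(n)), n - 1


sample = [52.0, 48.0, 51.0, 49.0, 55.0, 50.0, 53.0, 47.0]  # hypothetical data
z = z_statistic(sample, mu0=50.0, sigma=3.0)  # sigma assumed known
t, dof = t_statistic(sample, mu0=50.0)
print(f"z = {z:.3f}, t = {t:.3f} with {dof} degrees of freedom")
```

The only difference between the two functions is where the standard deviation comes from; the t statistic is then compared against a t distribution with `dof` degrees of freedom instead of the standard normal.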

  • @dheerajrajput6709
    @dheerajrajput6709 1 year ago +3

    After watching this mock interview, my confidence level has been boosted.

  • @AnujKumar-zl7jc
    @AnujKumar-zl7jc 3 years ago +22

    I have done an M.Phil in statistics, and from this live interview session I understand that the interviewer focused on basic statistics, all of which I covered in my graduation. So I think data science is not hard for statistics students 😉

    • @vikneswarankeerththanan9163
      @vikneswarankeerththanan9163 2 years ago +2

      Lol, nowadays the focus is on machine learning and deep learning. When you consider those, you need a good understanding of the algorithms.

  • @pranavgarg4124
    @pranavgarg4124 3 years ago +29

    Hi, this mock interview was very good. You guys asked simple questions, I knew most of the answers but explaining them properly is also important. Just a request Krish, when you think that you are not getting a proper answer then please give the right solution too, it can be brief. Thanks

    • @Saurabhyadav-yb2oo
      @Saurabhyadav-yb2oo 3 years ago

      Hi, hope you are doing well. I am looking for someone to practice with for data science interviews and to improve my communication. My background is in economics and statistics. Currently I am pursuing an MSc in quantitative economics.

  • @abhinavawasthi6262
    @abhinavawasthi6262 4 years ago +15

    Congratulations, Sahil Saini. It's so motivating to see you here.

  • @VivekKumar-je3pg
    @VivekKumar-je3pg 4 years ago +7

    Outliers can be detected using the interquartile range (IQR = Q3 - Q1).
    All values outside the range from Q1 - 1.5*IQR to Q3 + 1.5*IQR are outliers.
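The 1.5*IQR rule above can be sketched with the standard library as follows. The data is hypothetical, and the choice of `method="inclusive"` for the quartiles is an assumption; other quartile conventions give slightly different fences.

```python
import statistics


def iqr_outliers(values):
    """Return (lower_fence, upper_fence, outliers) using the 1.5 * IQR rule."""
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    outliers = [v for v in values if v < lower or v > upper]
    return lower, upper, outliers


data = [10, 12, 12, 13, 12, 11, 14, 13, 15, 102]  # hypothetical data
lower, upper, outliers = iqr_outliers(data)
print(lower, upper, outliers)
```

Because the fences are built from quartiles rather than the mean and standard deviation, a single extreme value such as 102 barely moves them.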

  • @FullfatIcecrem
    @FullfatIcecrem 4 years ago +12

    Wow some hardcore fundamentals discussion and very conceptual as well

  • @SudhanshuMohanty
    @SudhanshuMohanty 4 years ago +2

    I learned far more here than I ever did from data science coaching. Thank you.

  • @utkarshsharma7336
    @utkarshsharma7336 4 years ago +60

    "Yeah you can have 30 seconds."
    All the interviewer is doing is imposing his thought process on the student, trying to establish a kind of dominance.
    It's a bad example of an interview.

  • @janakchetri7731
    @janakchetri7731 4 years ago +20

    @krish the virtual interview is good. If you could also share the candidate's resume or CV for our reference, it would help aspiring data scientists like me and others prepare for the days ahead. 🙏

  • @anig8298
    @anig8298 4 years ago +30

    I have been working in ML for more than 8 years. From my experience I can say that even if candidates from prestigious schools perform poorly, it does not matter! Ultimately they are the only ones who will get into FANG thanks to the college brand; other good candidates will keep applying to these companies and get filtered out in HR screening.

    • @shivamkrathghara3340
      @shivamkrathghara3340 4 years ago +6

      What you say is 100 percent right.
      Recently a Bank of America job opening came up, but they mentioned that only graduates and postgraduates from IIT are allowed.

    • @anig8298
      @anig8298 4 years ago +10

      @@shivamkrathghara3340 Yes, for a data science job they will choose someone from Metallurgy, Civil, Mining, Biotech or any other branch of these so-called prestigious colleges, who hardly has any interest in his or her core branch, but will not give a fair chance to students of other universities who are equally talented, hard-working and from relevant streams.

    • @annamdurgashivaprasad4041
      @annamdurgashivaprasad4041 4 years ago

      Yes, this is what is happening in the present situation.

    • @varunmanjunath6204
      @varunmanjunath6204 4 years ago

      @@annamdurgashivaprasad4041 This only happens in India. In countries like the USA they test your skills.

  • @mathavraj9378
    @mathavraj9378 4 years ago +4

    Standardization and normalization are both used for feature scaling. We standardize when we know our data is normally distributed and our model makes that assumption, as in linear regression. We use normalization techniques such as min-max when our model does not make any assumption about the data distribution, as with neural networks.

  • @yasharthsingh805
    @yasharthsingh805 4 years ago +7

    Normalization: scales values to between 0 and 1
    Standardization: mean 0, SD 1 (preferred, because in normalization we might lose significant outliers)
    Logistic regression: we can use it for multiclass classification (one-vs-rest scheme)
    Loss function: cross-entropy
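A rough sketch of the last two points: the cross-entropy (log) loss for one binary logistic classifier, and a one-vs-rest prediction that picks the class whose own classifier is most confident. The class names and raw scores are hypothetical, and no training loop is shown.

```python
import math


def sigmoid(z):
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def binary_cross_entropy(y_true, p, eps=1e-12):
    """Log loss for one example with label y_true in {0, 1} and predicted probability p."""
    p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1.0 - p))


def one_vs_rest_predict(scores):
    """scores maps each class to the raw score of its own binary classifier;
    the predicted class is the one whose classifier is most confident."""
    return max(scores, key=lambda cls: sigmoid(scores[cls]))


scores = {"cat": -1.2, "dog": 2.3, "bird": 0.4}  # hypothetical classifier outputs
print(one_vs_rest_predict(scores))
print(binary_cross_entropy(1, sigmoid(2.3)))
```

In one-vs-rest, each class gets its own binary model trained with this loss against "everything else"; a softmax (multinomial) formulation is the other common route to multiclass logistic regression.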

    • @rafibasha4145
      @rafibasha4145 2 years ago

      How does normalization remove outliers?

    • @ArunYadav-qq1cj
      @ArunYadav-qq1cj 2 years ago

      @@rafibasha4145 It doesn't. Instead it just removes the skewness by normalizing the data.

    • @rafibasha4145
      @rafibasha4145 2 years ago

      @@ArunYadav-qq1cj It shouldn't even transform the distribution, I believe.

  • @salonigandhi4807
    @salonigandhi4807 4 years ago +116

    Just watched the first few minutes! The interviewer needs to give the kid a break! Rather than encouraging him, he is more focused on undermining his thought process. Shame!

    • @akshaysaxena7920
      @akshaysaxena7920 4 years ago +17

      Krish was still okay and encouraging the candidate, but Sudhanshu, though he might have a good amount of knowledge, shows a pride that is not favorable.

    • @rahulkrishna1657
      @rahulkrishna1657 4 years ago +2

      You are right - agree with you.

    • @swatantrachib5916
      @swatantrachib5916 4 years ago +9

      This is how most interviews are, companies will go through a large pool of candidates for a position and will not be so liberal with their time for each candidate.

    • @rdbokhi
      @rdbokhi 4 years ago +9

      This is so typical of Indian Interviewers. Grilling an interviewee isn't going to help!

    • @rahulkrishna1657
      @rahulkrishna1657 4 years ago +2

      @@swatantrachib5916 I don't think so; it won't go on this long.

  • @asadali4153
    @asadali4153 4 years ago +9

    Great interview. Keep posting mock interviews. But sir, please also give the answer or a hint so we can research it further.

  • @ananthakrishnans4951
    @ananthakrishnans4951 4 years ago +26

    51:50 me in every interview

  • @Mere_Shivjii
    @Mere_Shivjii 4 years ago +5

    Interquartile range and box plots are the most common methods for outlier detection. Capping the outliers at the 95th or 98th percentile should work for handling them.
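Percentile capping as suggested here might look like the sketch below. It assumes upper-tail capping only and the `inclusive` percentile convention; the data is hypothetical.

```python
import statistics


def cap_at_percentile(values, percentile=95):
    """Cap every value above the given percentile at that percentile (upper-tail capping)."""
    if not 1 <= percentile <= 99:
        raise ValueError("percentile must be an integer from 1 to 99")
    cut_points = statistics.quantiles(values, n=100, method="inclusive")
    cap = cut_points[percentile - 1]
    return cap, [min(v, cap) for v in values]


data = list(range(1, 20)) + [500]  # hypothetical data with one extreme value
cap, capped = cap_at_percentile(data, percentile=95)
print(cap, capped[-1])
```

Unlike dropping rows, capping keeps every observation and only pulls the extreme value back to the chosen percentile.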

  • @asadali4153
    @asadali4153 4 years ago +8

    Watching this interview made me so happy. Great help for you. Thumbs up to Sahil for facing the masters of data science.

  • @tintintintin576
    @tintintintin576 4 years ago +85

    Sahil, you're doing great and I believe that this interview is going to push you to do even GREATER things in life.
    All the best buddy!

    • @axisinfo819
      @axisinfo819 4 years ago +5

      This guy won't be the same person after this interview.

    • @varunmanjunath6204
      @varunmanjunath6204 4 years ago +7

      He doesn't know half the stuff. His basics are not clear.

    • @shafqat1138
      @shafqat1138 4 years ago +9

      He doesn't know half the stuff that undergraduate students at most top universities are expected to have mastered by the end of their 2nd-year Mathematics/Statistics/Econometrics courses (at most). For example, the difference between a Student's t and a Z score is that Student's t uses the sample standard deviation (because the population variance is unknown), incurring a penalty in degrees of freedom, whereas the Z score is calculated using a known population variance. This is taught in 1st year, and as a professional statistician it's painful to watch him go through this mock interview.
      In saying that, I harbour no disrespect towards the IITs, as I still believe them to be premier engineering schools. However, this is not the quality of students I'd like to study with if I were doing my master's in Statistics there.

    • @sastashroud7646
      @sastashroud7646 4 years ago

      @@shafqat1138 What should I do to build a solid foundation in machine learning or data science? Any suggestions would be helpful, because I am just getting started and I am in the final year of engineering. Should I focus more on the mathematical part?

    • @shafqat1138
      @shafqat1138 4 years ago +4

      @@sastashroud7646 Depends on how strong your background in Mathematical Statistics, Inference and Analysis (Real/Complex) is. I'm not a big fan of the whole "Machine Learning/Data Science" gimmick because these "Algorithms" have been around for quite a while in mainstream Econometrics/Statistics/Actuarial Science. It's just that computer aided Data Visualization tools have given ML Algorithms a bit more of a crunch as they're now more prone to being understood by your average non-Technical individual.
      My suggestion would be to focus on Big Data Analytics (Hadoop Architecture, Database Design & Management, et al) cause they are going to be the thing of the future. Big Data is in direct conflict with Classical Statistical Inference in that it allows real time processing of Population Scale Data whereas Classical Statistical Inference focused on using Sample Parameters to estimate Population Parameters.
      So my suggestion would be to focus on Big Data Analytics rather than Data Science.

  • @Ashish2331991Mr
    @Ashish2331991Mr 4 years ago +15

    Now that tells us how important the basics are. I really like how you guys dug into the basics, moved on to applications of each basic concept, and then built up gradually towards the end. As for Sahil, from my past experience I would say it was all show at the beginning and then turned out to be a no-go; he should have appeared for this when he was at least somewhat prepared.

  • @spadbob24
    @spadbob24 3 years ago +4

    My man nailed it: successfully answered most of the questions wrong 😂😂 The worst part is that most of these are covered on Krish Naik's channel.

  • @KiranKumar-eu2wu
    @KiranKumar-eu2wu 2 years ago +1

    34:54 The data is arranged sequentially because the correlation with recent lags is higher, which helps us arrive at accurate future values. The correlation decreases exponentially from recent lags to older lags.
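A small sketch of what a sequential train-test split can look like in code. The function names and the series values are illustrative only; the point is that the test window always comes after the training window and nothing is shuffled.

```python
def chronological_split(series, n_test):
    """Hold out the last n_test observations; never shuffle a time series."""
    if not 0 < n_test < len(series):
        raise ValueError("n_test must be between 1 and len(series) - 1")
    return series[:-n_test], series[-n_test:]


def expanding_window_splits(n_samples, n_splits, test_size):
    """Return (train_indices, test_indices) pairs where each training window
    ends right before its test window and grows from split to split."""
    first_train_end = n_samples - n_splits * test_size
    if first_train_end <= 0:
        raise ValueError("not enough samples for the requested splits")
    splits = []
    for i in range(n_splits):
        train_end = first_train_end + i * test_size
        splits.append((list(range(train_end)),
                       list(range(train_end, train_end + test_size))))
    return splits


series = [112, 118, 132, 129, 121, 135, 148, 148, 136, 119]  # hypothetical monthly values
train, test = chronological_split(series, n_test=2)
splits = expanding_window_splits(len(series), n_splits=3, test_size=2)
print(train, test)
```

The expanding-window variant plays the role that k-fold cross-validation plays for non-temporal data, while still never training on observations from the future.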

    • @newbie8051
      @newbie8051 1 year ago

      A refresher course on stats teaches all this; idk what bullshit this senior has read.

  • @chiragbhavsar6134
    @chiragbhavsar6134 4 years ago +29

    After seeing people call the interviewer rude, I think these people haven't given any vivas.

    • @yo-yd9ng
      @yo-yd9ng 4 years ago

      Yes bro, this is exactly how it goes here 😂😂

  • @kakumanuavinash7645
    @kakumanuavinash7645 3 years ago +3

    Hi guys, your interviews are good. One REQUEST: please do not smile or laugh when the candidate is not able to answer the questions.

  • @phanindraparashar8930
    @phanindraparashar8930 4 years ago +2

    Ford-Fulkerson algorithm with a super source and a super sink: solve for max flow. Connect students to the source and companies to the sink. If a student is interested in a company, add a link. The flow capacity between student and company is infinite, between source and student is 1, and between company and sink is 1.
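With unit capacities on the source and sink edges, the max-flow formulation above reduces to bipartite matching, so the sketch below uses the augmenting-path matching routine (the unit-capacity special case of Ford-Fulkerson) rather than a general flow solver. The student and company names are hypothetical.

```python
def max_matching(interests):
    """interests maps each student to the companies they are interested in.
    Returns a dict company -> student with as many pairs as possible."""
    match = {}

    def try_assign(student, visited):
        # Look for an augmenting path that starts at this student.
        for company in interests[student]:
            if company in visited:
                continue
            visited.add(company)
            # Take a free company, or re-route whoever currently holds it.
            if company not in match or try_assign(match[company], visited):
                match[company] = student
                return True
        return False

    for student in interests:
        try_assign(student, set())
    return match


interests = {  # hypothetical preferences
    "ben": ["acme", "globex"],
    "asha": ["acme"],
    "chen": ["globex", "initech"],
}
match = max_matching(interests)
print(match)
```

Here "ben" first takes "acme" and is later re-routed to "globex" so that "asha" can be placed; that re-routing is exactly what pushing flow along an augmenting path does.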

  • @beyond_numbers77
    @beyond_numbers77 2 years ago +1

    Normalization is rescaling the data into the range [0, 1]; we can do it using a min-max scaler. Standardization is a form of normalization using the z-score.
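Both rescalings described here fit in a few lines of standard-library Python. The data is hypothetical, and using the population standard deviation (`pstdev`) in the z-score is an assumption of this sketch.

```python
import statistics


def min_max_scale(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("all values are equal; min-max scaling is undefined")
    return [(v - lo) / (hi - lo) for v in values]


def standardize(values):
    """Z-score each value: subtract the mean, divide by the population standard deviation."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    if sd == 0:
        raise ValueError("all values are equal; z-scores are undefined")
    return [(v - mean) / sd for v in values]


data = [2.0, 4.0, 6.0, 8.0, 100.0]  # hypothetical data; 100 is an extreme value
scaled = min_max_scale(data)
z_scores = standardize(data)
print(scaled)
print(z_scores)
```

Both transforms are linear, so the extreme value stays extreme after scaling: min-max squeezes the other four values close to 0, and the z-scores keep the same ordering and relative gaps.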

  • @vishalgupta9620
    @vishalgupta9620 4 years ago +16

    He is having trouble understanding the technicalities of the questions.
    He does not think like a programmer.

  • @nitheesh340
    @nitheesh340 3 years ago +1

    Nice Interview. Sahil was completely nervous and couldn't think clearly for most of the questions. He was getting confused between the actual logic and the implementation part. For the resume assistance problem, he kept talking about the algorithms, but he seldom spoke about how the data could be extracted from the resume and how that could be handled. Basically, he somehow kept forgetting about how to extract and label the data and concentrated mainly on the algorithm. Anyway 2 technical interviewers are definitely going to make anyone nervous. Even I would have got nervous and forgotten some of the concepts. At least this was a good experience for him and viewers like me. I am new to this field and started learning Data Science a few months back. Anyway I can conclude that the resume keywords play a key role in the interviews and concepts are more important. Thank you for the wonderful session, Krish Sir and Sudhanshu Sir.

  • @newbie8051
    @newbie8051 1 year ago +1

    Tree-based algorithms perform well on imbalanced datasets. Boosting algorithms are ideal for imbalanced datasets because higher weight is given to the minority class at each successive iteration.
    40:01
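Besides the choice of algorithm, imbalance is often handled with class weights. A minimal sketch, assuming the common "balanced" heuristic of weight = n_samples / (n_classes * class_count); the 90/10 label split is made up.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class inversely to its frequency so that every class
    contributes the same total weight to the loss."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


labels = [0] * 90 + [1] * 10  # hypothetical imbalanced labels
weights = balanced_class_weights(labels)
print(weights)
```

A mistake on the minority class then costs nine times as much as one on the majority class, which pushes a weighted learner away from always predicting the majority.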

  • @ashishbhatnagar8682
    @ashishbhatnagar8682 4 years ago +4

    Big thank you for such videos. They are very useful and informative.

  • @mizgaanmasani8456
    @mizgaanmasani8456 4 years ago +7

    Good questions, and not so tough! I wish the same questions had been asked of me!

    • @7justfun
      @7justfun 4 years ago +1

      Can you type the answers you know? It will be helpful for all.

  • @rajnigoyal6653
    @rajnigoyal6653 3 years ago +2

    Good to know the type of questions we should prepare for.

  • @Vijay-iq1fh
    @Vijay-iq1fh 4 years ago +8

    One interview can't decide the level of an IITK student; Mr. Sahil is also DPC at IITK. A non-IITian can't imagine the effort he is putting into this course to maintain his CPI. Moreover, none of the IITs are bad; it's up to you how you want to learn, whether just to get a job or to go for a PhD towards research or teaching and deepen your knowledge of the subject. So all who are blaming Mr. Sahil here, think once more.

    • @navdeepvarshney6422
      @navdeepvarshney6422 1 year ago

      You are 80 percent right. Not all students from the IITs get jobs.

  • @arvindsinghrawat538
    @arvindsinghrawat538 4 years ago +3

    Great job 👍, and Sahil did very well.

  • @vedshukla659
    @vedshukla659 3 years ago +4

    The boy was just a bit stressed. I think he did pretty well.

  • @phanindraparashar8930
    @phanindraparashar8930 4 years ago +4

    Normalization is forcing the data to behave normally by using a transformation like Box-Cox.
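For reference, the Box-Cox transform mentioned here is (x^λ - 1)/λ for λ ≠ 0 and log(x) for λ = 0, defined for positive x only. A minimal sketch with hypothetical values; choosing λ (usually by maximum likelihood) is not shown.

```python
import math


def box_cox(values, lam):
    """Apply the Box-Cox power transform with parameter lam to positive values."""
    if any(v <= 0 for v in values):
        raise ValueError("Box-Cox requires strictly positive values")
    if lam == 0:
        return [math.log(v) for v in values]
    return [(v ** lam - 1) / lam for v in values]


incomes = [1.0, 4.0, 9.0, 100.0]  # hypothetical right-skewed data
print(box_cox(incomes, 0.5))  # square-root-like compression of the long tail
print(box_cox(incomes, 0))    # log transform
```

Unlike min-max scaling or z-scoring, this is a non-linear transform, so it does change the shape of the distribution.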

  • @swatantrachib5916
    @swatantrachib5916 4 years ago +11

    I think the guy being interviewed does not seem to understand what statistics is from a first-principles perspective. He only seems to know how to do something, but not why it is done that way.

    • @greedygoblinsgaming2935
      @greedygoblinsgaming2935 3 years ago +2

      Yup, I think he just did what he was told. He didn't ask why we do so.

  • @sushantjillawar7164
    @sushantjillawar7164 4 years ago +7

    Please provide the answers that the guy failed to give, and also some detailed answers to the sub-questions, such as whether there is any other way to do it.
    Please make a separate video.

  • @phanindraparashar8930
    @phanindraparashar8930 4 years ago +3

    Discrete choice modelling is all about using logistic regression for multiclass classification. I don't think there is any reason not to use it. At our university we have an entire course on this.

    • @Pushpa_Neupane
      @Pushpa_Neupane 11 months ago

      Logistic regression in its basic form is used for binary classification because the sigmoid function, 1/(1 + e^(-y)), gives you a value between 0 and 1. You set a threshold: if the predicted value is 0.5 or more, it belongs to class 1. The S-shaped graph of the sigmoid function shows this. That is it.

  • @crackledsp6920
    @crackledsp6920 4 years ago +23

    Please answer the questions, Krish sir, so that we can learn.

    • @mathavraj9378
      @mathavraj9378 4 years ago +3

      For the resume question,
      medium.com/@Synerzip/resume-ranking-using-machine-learning-implementation-47959a4e5d8e
      Basically we can use a logistic regression which outputs probability score of suitability of resume. Training data will be resumes selected and rejected in past. That being said this seems to be inefficient to me and unfair, because requirements might have changed from training set to now

  • @raghavverma120
    @raghavverma120 2 years ago

    The answer to the first question is degrees of freedom. Basically, if you have a sample of 10 values and you know its mean and 9 of the values, the 10th can be calculated. So using 'n' in the denominator introduces bias. Just as we avoid the dummy variable trap by dropping the first dummy, we reduce n by 1 to remove that bias.
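The bias can be checked exactly on a toy case. The sketch below enumerates every sample of size 2 (drawn with replacement) from a made-up three-value population and averages the two variance estimates; exact fractions are used so there is no rounding.

```python
from fractions import Fraction
from itertools import product


def variance(sample, ddof):
    """Sum of squared deviations from the sample mean, divided by (n - ddof)."""
    n = len(sample)
    mean = Fraction(sum(sample), n)
    return sum((x - mean) ** 2 for x in sample) / (n - ddof)


population = [2, 4, 6]  # hypothetical tiny population
true_variance = variance(population, ddof=0)

# Every possible sample of size 2 drawn with replacement.
samples = list(product(population, repeat=2))
avg_biased = sum(variance(s, ddof=0) for s in samples) / len(samples)
avg_unbiased = sum(variance(s, ddof=1) for s in samples) / len(samples)
print(true_variance, avg_biased, avg_unbiased)
```

Dividing by n underestimates the population variance on average (here by the factor (n - 1)/n = 1/2), while dividing by n - 1 recovers it exactly; this is Bessel's correction.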

  • @omkarnadkarni7253
    @omkarnadkarni7253 4 years ago +1

    To all the people hating on the interviewer: that's how a mock interview should be. If you are prepared for the worst, you can do well in any scenario.

  • @akashkumar-ld2bw
    @akashkumar-ld2bw 4 years ago +4

    If, say, 20 companies are there, our main task is to classify each resume into two categories: whether the candidate can sit the interview or not. This can be done by selecting as features the skills the companies want. We also need to account for candidates who have experience or have done a project in a specific field, compared with candidates who only have knowledge of it, so that candidates with experience are given preference. Also try to keep the number of features low. Now we can train a model on all the features (inputs). For each feature we also need to associate a number with it: if a company wants an expert in one skill while basic knowledge is sufficient for another, I can assign different numbers. Based on that I can create a model, just like we define a region in LPP, e.g. y >= 3x1 + 4x2 + ..., something like that.
    Then I can make 10 models (the number of companies coming for the placement drive); each model has its own output.
    But if there is a constraint that a single candidate can only appear for 3 or 4 interviews, then we need to think further.
    Now that I have the models ready and the students' key skills as data, I will pass all this data to each model, measure the shortest distance from each data point to each constraint line of that model, and sum it over all the straight-line boundaries for the constraints, like y >= 6x1 + 3x2.
    We find this distance for every student's resume data and then apply scaling so that students can be ranked by how suitable they are.
    The same thing can be done for every model.
    But I think I haven't used ML in this approach.
    🤨

  • @phanindraparashar8930
    @phanindraparashar8930 4 years ago +3

    You can convert it into a flow problem. It's a graph problem, a matching problem.

  • @ramshankarkamat9843
    @ramshankarkamat9843 4 years ago +2

    Now it feels like a real interview. Great session. Thank you for sharing it.

  • @raghavverma120
    @raghavverma120 2 years ago

    The answer to the 2nd question is inventory management: by creating a confidence interval we can easily predict the stock for a given month at the 90%-95% level.

  • @ayushmishra3532
    @ayushmishra3532 4 years ago +11

    Holy shit! You destroyed the guy!! You should have invited a ug guy! He would have destroyed this interview :)

  • @bhaskargoswami6283
    @bhaskargoswami6283 4 years ago +10

    For pin code, can't we break it down into components, such as the first few characters, the middle, and the last part, and then apply mean, target or one-hot encoding to each component?
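One way to read this suggestion in code: keep only the leading digits of the pin code as a coarser category, then replace each category with the mean of the target (target encoding). The pin codes, the binary target and the prefix length of 3 are all hypothetical.

```python
from collections import defaultdict


def pin_prefix(pin_code, length=3):
    """Use the leading digits as a coarser region-level category."""
    return pin_code[:length]


def fit_target_encoding(categories, targets):
    """Map each category to the mean target value observed for it."""
    sums, counts = defaultdict(float), defaultdict(int)
    for category, target in zip(categories, targets):
        sums[category] += target
        counts[category] += 1
    return {category: sums[category] / counts[category] for category in sums}


pins = ["560001", "560034", "110001", "110020"]  # hypothetical training rows
placed = [1, 0, 1, 1]                            # hypothetical binary target
prefixes = [pin_prefix(p) for p in pins]
encoding = fit_target_encoding(prefixes, placed)
features = [encoding[p] for p in prefixes]
print(encoding, features)
```

This yields one numeric column instead of the thousands of columns one-hot encoding would create for raw pin codes. The encoding should be fitted on training data only, otherwise the target leaks into the features.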

  • @ashishpant3395
    @ashishpant3395 7 months ago

    Sir, I watch your videos 😍😍😍 and I'm gaining confidence in myself, sir.

  • @ranjiranju7759
    @ranjiranju7759 4 years ago +2

    Great one, Krish and Sudhanshu. Also, please get somebody from Oracle Apps (ERP) with 2-3 years of experience who has made the transition to DS.

  • @ameyascraft947
    @ameyascraft947 4 years ago +4

    You give so much confidence, sir, to learn more.

  • @shubhamkumar5579
    @shubhamkumar5579 4 years ago +6

    I think this will clear up the use of the CLT. This is a post I put on LinkedIn some days ago; I hope it helps in understanding it.
    The CLT states that the distribution of sample means approximates a normal distribution as the sample size becomes larger, assuming that all samples are identical in size, and regardless of the shape of the population distribution.
    The best part of this theorem for real-life problems is that, regardless of the shape of the distribution (i.e. even if we have no idea about the population's distribution), one can compute the probability of an event for that distribution using samples and the CLT.
    A layman who is not from a statistics stream
    can understand it with one simple example:
    Suppose we want to check the probability (i.e. chance) that an Indian has an income of more than 50,000. We can compute it using the central limit theorem and a sample (i.e. a subset of the population data), even if we have no idea about the distribution of Indians' incomes.
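The statement of the CLT above can be checked with a short simulation. The exponential population (mean 1, standard deviation 1), the sample size of 50 and the 2000 repetitions are arbitrary choices for illustration.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the sketch is repeatable


def sample_mean(n):
    """Mean of n draws from a skewed exponential population with mean 1."""
    return statistics.fmean([rng.expovariate(1.0) for _ in range(n)])


sample_size = 50
means = [sample_mean(sample_size) for _ in range(2000)]

print("mean of sample means:", round(statistics.fmean(means), 3))
print("std of sample means: ", round(statistics.stdev(means), 3))
print("CLT prediction:      ", round(1.0 / sample_size ** 0.5, 3))
```

Although the population is strongly skewed, the sample means cluster around the population mean with a spread close to sigma / sqrt(n), which is what lets a confidence interval be built from a single sample.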

    • @greatgreek3569
      @greatgreek3569 4 years ago

      Can I say that if my sample has a lot of values and I am supposed to calculate the standard error of a lot of samples from the Indian population, then I can use the CLT here, as calculating the SE for every sample won't be feasible? Right? I'm a student; ignore it if I made any wrong assumptions.

    • @shubhamkumar5579
      @shubhamkumar5579 4 years ago

      @@greatgreek3569 What you have written is a little confusing: are you talking about one sample or a group of samples? As I understand it, if you want to calculate a standard error, you can calculate it normally for a single sample or for a group of sample means; the CLT need not be used here. The CLT is only used when we have no idea about the distribution of the population and we want to get some information about that population from a subset of it, like the chance of occurrence of an event.
      One more thing you can notice: the idea behind using sample means is that they estimate the population better than a single sample, since they consist of many samples.
      Hope you got it.

  • @phanindraparashar8930
    @phanindraparashar8930 4 years ago +2

    We can also convert it into an optimisation problem: linear programming.

  • @rangersw3227
    @rangersw3227 4 years ago +1

    Before commenting on that person, know where you stand, guys. Just encourage him; these questions are not that easy. Just try to learn.

  • @hemantsharma7986
    @hemantsharma7986 4 years ago +4

    Can the Z score only be used when the data is normally distributed? And IQR as well?

    • @arunchowdary4925
      @arunchowdary4925 4 years ago

      Z is generally used when the data is normal; if it is not normal, the t-test is used.

  • @Travel_and_music_shorts
    @Travel_and_music_shorts 4 years ago +1

    If I join the channel membership that includes data science materials, can I view the members-only streams, or do I need a different membership?

  • @gauravlodhi5137
    @gauravlodhi5137 4 years ago +7

    Invite IIIT Delhi peeps for an interview. If you want, I can volunteer.

    • @krishnaik06
      @krishnaik06 4 years ago +4

      Sure, drop a mail to krishnaik06@gmail.com

    • @gauravlodhi5137
      @gauravlodhi5137 4 years ago +5

      @@krishnaik06 Okay definitely sir, let me handle my End Sem Exam first ;)

    • @devanshadhikari9085
      @devanshadhikari9085 4 years ago

      @@krishnaik06 Sir, how about an Applied Statistics and Informatics student from IIT Bombay?

    • @gauravlodhi5137
      @gauravlodhi5137 4 years ago

      @@deepanshuchoudhary4598 Gentleman be patient I'm waiting for a suitable time.

  • @anujshah7350
    @anujshah7350 4 years ago +2

    I am a machine learning beginner and I could answer most of the questions. All thanks to the Applied AI Course online.

    • @HarisRehman
      @HarisRehman 4 years ago

      Which course?

    • @anujshah7350
      @anujshah7350 4 years ago

      @@HarisRehman Search for appliedaicourse on Google.

  • @chiraglimbachia3731
    @chiraglimbachia3731 4 years ago +1

    Normalization/standardization does not help treat outliers. They only change the mean and scale of the data. I would like to know if there is any normalization technique that helps treat outlier. Yes, certain transformations can be applied to data to treat outliers.

    • @iamsachinnegi
      @iamsachinnegi 6 months ago

      Yes, to some extent. I want to clarify that an abs transform can treat outliers, as it is also a form of normalisation. Some people confuse normalisation with MinMaxScaler only, when normalisation actually has many sub-categories.

  • @lovishkandoi63
    @lovishkandoi63 4 years ago +14

    Sahil bro, did you forget everything you studied at Deep Institute? :)

  • @rk-wy8pu
    @rk-wy8pu 4 years ago +21

    For someone who has an IIT background, I expected him to be able to explain his understanding of the concepts asked about more clearly. Pretty disappointing.

    • @himanshudaksh4957
      @himanshudaksh4957 4 years ago

      He is a first-year student. And people generally learn prob-stat and data mining in the 3rd year.

    • @mthetree
      @mthetree 4 years ago +4

      @@himanshudaksh4957 He is a master's student in statistics (MSc Statistics), not an engineering student.

    • @himanshudaksh4957
      @himanshudaksh4957 4 years ago

      @@mthetree Sorry for my statement. I didn't notice that.

    • @glokta1
      @glokta1 4 years ago +6

      Eh, why? Because he had better profs? (Not true, btw.) The biggest determiner of your understanding is how much you've self-studied and ruminated on the topic yourself. No amount of teaching can give you that clarity.

    • @gauravshukla9962
      @gauravshukla9962 3 years ago +1

      He is a postgraduate student, and the real IITians who have passed JEE Advanced and earned a bachelor's degree from one of the top 5 IITs would be more knowledgeable than the two interviewers.

  • @adarshraj1467
    @adarshraj1467 3 years ago +4

    The attitude of Sudhanshu isn't good, in fact very mean towards the interviewee, which could diminish his confidence. Please don't laugh at them and say things like "I'll give you 30 secs".

  • @saurabhjadhav6412
    @saurabhjadhav6412 4 years ago +7

    Is this how they conduct data science interviews?

    • @saitrinathdubba
      @saitrinathdubba 4 years ago +3

      Nope, not at all!! Sudharshan passes statements just like that!! They don't make you feel suffocated.

    • @parthtool
      @parthtool 4 years ago

      It's possible that the same type of people are present in Indian companies. AI engineers are everywhere, so these questions could be copied by all.

  • @ann5563
    @ann5563 4 years ago +5

    Guys, please stop degrading that guy. He isn't some god; he is as much a human as us. Maybe he is scared or has some other issue, God knows. Stop degrading.
    Just because you failed once doesn't mean you can't get up again. At least he gets a chance to improve. This is so stupid.

  • @kritikarya3204
    @kritikarya3204 3 years ago

    We use the z-statistic if the population variance is known and the t-statistic if the population variance is unknown.
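
The distinction in this comment can be sketched for the one-sample case. This is a minimal illustration with made-up numbers, assuming a hypothesised mean `mu0` and, for the z version, a population standard deviation `sigma` that is taken as known; the function names are hypothetical.

```python
import math
import statistics


def z_statistic(sample, mu0, sigma):
    """One-sample z statistic; sigma is the KNOWN population standard deviation."""
    n = len(sample)
    return (statistics.mean(sample) - mu0) / (sigma / math.sqrt(n))


def t_statistic(sample, mu0):
    """One-sample t statistic; the standard deviation is ESTIMATED from the sample."""
    n = len(sample)
    s = statistics.stdev(sample)  # sample standard deviation (divides by n - 1)
    return (statistics.mean(sample) - mu0) / (s / math.sqrt(n))


sample = [5.1, 4.9, 5.6, 5.8, 5.0, 5.3]      # made-up measurements
z = z_statistic(sample, mu0=5.0, sigma=0.4)  # sigma = 0.4 is assumed known
t = t_statistic(sample, mu0=5.0)             # referred to a t distribution with n - 1 = 5 d.o.f.
```

The two formulas differ only in where the standard deviation comes from; the t statistic is then compared against a t distribution, whose heavier tails account for the extra uncertainty in the estimated standard deviation.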

  • @AniketSomwanshi-ll7mz
    @AniketSomwanshi-ll7mz 4 years ago +27

    Why is the interviewer not ready to hear a word outside his dictionary? Lol, it's like that teacher from 3 Idiots. No offence tho.

    • @siddhartharaja9413
      @siddhartharaja9413 4 years ago +8

      Yeah, they are like that. They only want to hear those predefined bookish answers, not anything new, because when they are taking an interview they think of themselves as the boss and think that they know everything.

    • @ashutoshkarna2133
      @ashutoshkarna2133 4 years ago +7

      It is a shame that a master's student in statistics is being questioned by those who have no knowledge of statistics themselves.

    • @aritramj
      @aritramj 4 years ago +3

      Absolutely. I was about to write this. He has just memorized things by rote and is acting like a hero. In reality he doesn't have much of a brain.

    • @salonigandhi4807
      @salonigandhi4807 4 years ago +5

      I'm stoked at the fact that people think this is how interviews are conducted. Rather than encouraging the candidate and asking him probing questions, they are undermining his thought process.

  • @phanindraparashar8930
    @phanindraparashar8930 4 years ago +2

    Focal loss is also good for imbalanced data.
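
For readers unfamiliar with the term, focal loss is cross-entropy multiplied by a factor `(1 - p_t) ** gamma`, where `p_t` is the probability the model gives to the true class. The sketch below is a minimal per-example version for the binary case; it leaves out the optional class-weighting term, and `gamma=2.0` is just an illustrative default.

```python
import math


def binary_cross_entropy(y, p):
    """Cross-entropy for one example: y is the 0/1 label, p is P(y = 1)."""
    p_t = p if y == 1 else 1.0 - p  # probability assigned to the true class
    return -math.log(p_t)


def focal_loss(y, p, gamma=2.0):
    """Focal loss for one example: cross-entropy scaled by (1 - p_t) ** gamma."""
    p_t = p if y == 1 else 1.0 - p
    return -((1.0 - p_t) ** gamma) * math.log(p_t)


easy = focal_loss(1, 0.95)  # well-classified example: heavily down-weighted
hard = focal_loss(1, 0.10)  # misclassified example: keeps most of its loss
```

Because easy examples contribute very little, the abundant, easily classified majority class dominates the total loss less, which is the usual argument for using it on imbalanced data.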

  • @kojo5946
    @kojo5946 3 years ago +2

    This guy is super brave!😂 although he wasn’t technical enough in his answers, he was courageous enough to do this publicly🤣. Good job fam!

  • @akshaysaxena7920
    @akshaysaxena7920 4 years ago +2

    I couldn't get into IIT. Before watching this video, I felt a lot of guilt about it. Thanks to this video, now I feel happy and I won't send my kids to IIT ever. 😜

    • @souviksarkar628
      @souviksarkar628 4 years ago +3

      Teachers in IITs are the best teachers India could produce, but not every one of them; you can look at many NPTEL courses and check their teaching techniques.
      Many teachers just read those PPTs and put in the least amount of effort. The main problem is blindly accepting that "IITs are absolute" (andh bhakti, i.e. blind devotion). Even major companies follow this. (I am not from an engineering background; I am a physics student.)
      Getting into IITs is harder due to reservation.

    • @devanshsharma3502
      @devanshsharma3502 4 years ago

      Bro, there is a huge difference between UG and PG students. You may think anything that helps you have a good sleep.

    • @sayanray8532
      @sayanray8532 3 years ago

      @@devanshsharma3502 But these PGs will be earning much more than UGs in other colleges, so it's a win situation, I guess.

  • @chandrasekharuddagiri6653
    @chandrasekharuddagiri6653 4 years ago

    A PIN code is a tuple made up of (LAT, LONG). Lat and long are numerical data by themselves.

  • @dheerajsharma5492
    @dheerajsharma5492 4 years ago +12

    I will never sit with you guys 😃

  • @skyrayzor3693
    @skyrayzor3693 11 months ago

    Around timestamp 34:05, Krish sir asked Sahil why we select the time series data points located at the end of the dataset as test data, since we can't split time series data with the usual method.
    My answer: because of data shift? (If I am wrong, please correct me.)
    Thank you.
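
On the question raised in this comment: one commonly cited reason is that a random split would let the model train on observations that come after the ones it is tested on, so the test score would no longer reflect forecasting the future. A minimal sketch of a chronological split follows, assuming the series is already sorted oldest first; `chronological_split` and the toy data are illustrative, not taken from the video.

```python
def chronological_split(series, test_fraction=0.2):
    """Split an already time-ordered sequence: earliest part trains, latest part tests."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be between 0 and 1")
    n_test = max(1, round(len(series) * test_fraction))
    return series[:-n_test], series[-n_test:]


# Made-up daily observations as (day_index, value) pairs, oldest first.
observations = [(day, 100 + 2 * day) for day in range(10)]
train, test = chronological_split(observations, test_fraction=0.2)
```

With this split every training observation is older than every test observation, which mimics how the model would be used in practice.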

  • @nik7867
    @nik7867 4 years ago +1

    Can we do feature-based classification using neural networks by inputting 1000 data points for those 10 features, processing them with a train/test/CV split, and getting a proper model with a one-vs-all output format?

  • @ambrosearuwa9458
    @ambrosearuwa9458 4 years ago +2

    I am looking for a machine learning/data science internship and don't know how to go about it. Any tip will be helpful.
    I would also like to know your domain and how you chose the right domain for yourself.

    • @Saurabhyadav-yb2oo
      @Saurabhyadav-yb2oo 3 years ago

      Hi, I hope you are doing well and have done an internship. Actually, I am looking for someone to practice and prepare for data science interviews with, and to improve communication. My background is in economics and statistics. Currently I am pursuing an MSc in quantitative economics.

  • @bhargavrm
    @bhargavrm 3 years ago

    N-1 is used to account for the degrees of freedom when you are taking a sample and not the full population.
    Hope my point makes sense.
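
The effect of dividing by N-1 can be checked exactly on a toy population by enumerating every possible sample. This is a minimal sketch with made-up numbers; the `variance` helper and its `ddof` parameter (the amount subtracted from the sample size in the divisor) are illustrative names.

```python
import itertools


def variance(xs, ddof):
    """Sum of squared deviations from the mean, divided by (len(xs) - ddof)."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - ddof)


population = [2.0, 4.0, 6.0, 8.0]        # toy population
true_var = variance(population, ddof=0)  # population variance

# Every possible sample of size 2, drawn with replacement.
samples = list(itertools.product(population, repeat=2))

avg_divide_by_n = sum(variance(s, ddof=0) for s in samples) / len(samples)
avg_divide_by_n_minus_1 = sum(variance(s, ddof=1) for s in samples) / len(samples)
```

For this toy population the N-1 version averages out to the population variance (5.0), while dividing by N averages to half of it (2.5), because deviations are measured from the sample mean rather than the true mean.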

  • @kratnakrishna
    @kratnakrishna 4 years ago

    Knowing the central limit theorem helps us understand the spread of the data easily, because the normal distribution lets us calculate the PDF and CDF, and we can conclude what percentage of the data lies under a given limit. Correct me if I am wrong.

    • @ashutoshkarna2133
      @ashutoshkarna2133 4 years ago +1

      Yes, you are wrong. The central limit theorem allows you to expect almost all types of distribution to asymptotically follow normality. There is no relevance of data spread here.

    • @kratnakrishna
      @kratnakrishna 4 years ago

      Thanks 👍 for this info
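
As a concrete illustration for this thread: the central limit theorem is a statement about sample means (or sums), not about the raw data itself. The sketch below is a small simulation under assumed settings (an Exponential(1) source distribution, samples of 50, 2000 repetitions, a fixed seed); all of these choices are arbitrary.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

SAMPLE_SIZE = 50   # observations per sample
N_SAMPLES = 2000   # number of sample means to collect

# Exponential(1) is strongly skewed, with mean 1 and standard deviation 1.
sample_means = [
    statistics.mean(random.expovariate(1.0) for _ in range(SAMPLE_SIZE))
    for _ in range(N_SAMPLES)
]

mean_of_means = statistics.mean(sample_means)
sd_of_means = statistics.stdev(sample_means)

# Share of sample means within one standard deviation of their centre;
# a normal distribution would put roughly 68% there.
within_one_sd = sum(
    abs(m - mean_of_means) <= sd_of_means for m in sample_means
) / N_SAMPLES
```

Even though the individual observations are skewed, the sample means cluster around 1 with a spread close to 1 / sqrt(50), and their shape is close to normal.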

  • @vivekpathak03
    @vivekpathak03 3 years ago +1

    You were good Sahil 🙂

  • @jatinyadav8960
    @jatinyadav8960 3 years ago +1

    I think one should take one or two days to prepare before an interview.

  • @KiranKumar-eu2wu
    @KiranKumar-eu2wu 2 years ago

    36:38 We do differencing to try to reduce those shocks in the series.
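
Differencing is most often described as a way to remove trend from a series so that it looks more stationary. A minimal sketch follows, using a made-up series; the `difference` helper is a hypothetical name.

```python
def difference(series, lag=1):
    """Return series[t] - series[t - lag] for every t where both exist."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


trend = [3 + 2 * t for t in range(6)]  # made-up series with a linear trend
first_diff = difference(trend)         # constant once the trend is removed

quadratic = [t * t for t in range(1, 5)]             # 1, 4, 9, 16
second_diff = difference(difference(quadratic))      # differencing applied twice
```

A linear trend disappears after one round of differencing and a quadratic one after two, which is why the number of differences is usually chosen by inspecting the series.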

  • @SkulLCruSHER1205
    @SkulLCruSHER1205 4 years ago +2

    Can you share the resume also,
    so that we can get to know the format for reference?

  • @samakshkaushik3707
    @samakshkaushik3707 4 ปีที่แล้ว +9

    The interviewer just wanted to hear what he wants to hear.
    Nothing more nothing less

    • @subho2859
      @subho2859 4 years ago

      Exactly...

    • @samakshkaushik3707
      @samakshkaushik3707 4 years ago

      @@TAF3000 He is not even listening, nor is he giving him time to think. In real life, interviews are never conducted this way, or maybe I never faced such an interviewer, not even at FANG.

    • @abhishek007123
      @abhishek007123 4 years ago +2

      Lol, not really. That's the minimum expected from an IIT statistician. They heard him.

  • @tripathi5174
    @tripathi5174 4 years ago +2

    11:34 What's with the attitude, man? We know you are good, but sir, give the guy the time he deserves. Sudhanshu

    • @parthtool
      @parthtool 4 years ago +1

      Exactly. That man Sudhanshu is so triggered, he gets agitated every second of the interview. Crass and egotistical attitude. No patience at all.

    • @tripathi5174
      @tripathi5174 4 years ago

      @@parthtool You are totally correct.

  • @joeljoy5164
    @joeljoy5164 4 years ago +1

    Logistic regression uses binary cross-entropy as its loss function.
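
For reference, binary cross-entropy averages -[y*log(p) + (1-y)*log(1-p)] over the examples, where p is the predicted probability of class 1. The sketch below is a minimal version with made-up labels and scores; the scores stand in for the linear output w.x + b of a logistic regression and are hypothetical.

```python
import math


def sigmoid(z):
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def binary_cross_entropy(y_true, y_prob):
    """Mean of -[y*log(p) + (1-y)*log(1-p)] over all examples."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)


labels = [1, 0, 1, 0]
scores = [2.0, -1.0, 0.5, 0.3]        # hypothetical linear scores w.x + b
probs = [sigmoid(s) for s in scores]  # predicted P(y = 1)
loss = binary_cross_entropy(labels, probs)
```

The loss is small when the predicted probability of the true class is close to 1 and grows quickly as a confident prediction turns out to be wrong.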

  • @jvkk123
    @jvkk123 4 years ago

    These guys are a gift for all aspiring DS!

  • @pyclassy
    @pyclassy 4 years ago +7

    I watched this until 31:36. Thank God this boy didn't name a favourite algorithm from deep learning, otherwise Sudhanshu would have torn him apart.
    But this session will definitely help this boy.

  • @tanayasharma7776
    @tanayasharma7776 4 years ago +2

    That's Ranvijay from Roadies 😭

  • @pritam1366
    @pritam1366 4 years ago +3

    Sudhansu with OP questions, man. Even if I can explain, he will find another question from that. Obviously boss level.

  • @rogerthat1515
    @rogerthat1515 4 years ago +2

    This guy (Sudhanshu Kumar) is incapable of any job in India. If anyone can find out, please let me know how many times he said "LIKE A"!

  • @rahulkrishna1657
    @rahulkrishna1657 4 years ago +13

    Just dominating doesn't make you smart. Is this the way an interview is done? Lol.

  • @DnyaneshwarPanchaldsp
    @DnyaneshwarPanchaldsp 3 years ago

    Nice to see the questions and answers 💐💐