K Nearest Neighbour Easily Explained with Implementation

  • Published Dec 2, 2024

Comments • 94

  • @sahilvlogs5848
    @sahilvlogs5848 2 years ago +1

    Great explanation! My teacher took 2 days and I didn't understand a word; after watching this 18-minute video I'm done with KNN. Thank you!

  • @thomsondcruz
    @thomsondcruz 2 years ago +1

    Excellent video! 3:41 Euclidean distance is nothing but the Pythagorean theorem's way of calculating the hypotenuse.
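
A quick illustration of that point, since Euclidean distance in 2-D is exactly the Pythagorean hypotenuse. A minimal sketch (the points are made up):

    import math

    # Two made-up points in 2-D
    p1 = (1.0, 2.0)
    p2 = (4.0, 6.0)

    # Euclidean distance = sqrt((x2 - x1)^2 + (y2 - y1)^2), i.e. the hypotenuse
    print(math.sqrt((p2[0] - p1[0]) ** 2 + (p2[1] - p1[1]) ** 2))  # 5.0
    print(math.dist(p1, p2))  # same result via the standard-library helper (Python 3.8+)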

  • @naveendubey2815
    @naveendubey2815 4 years ago +1

    Excellent Krish... you are really giving a lot to society.

  • @kamran_desu
    @kamran_desu 4 years ago +9

    Great explanation, just adding my thoughts here.
    @12:20, you've mentioned K=1 is underfitting. I think it's the other way around:
    low K means highly flexible, jagged decision boundaries (low bias, high variance), leading to overfitting. (A quick train/test check follows this thread.)

    • @eduardomedina5081
      @eduardomedina5081 3 ปีที่แล้ว +1

      Good point

    • @KeigoEdits
      @KeigoEdits 2 years ago

      Hey Kamran, did you get why he used 23 as k and not 33, when 33 gives the highest accuracy? I get the point about overfitting, maybe that's why we didn't choose 33, but why 23 either?
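
The over/underfitting point in the thread above is easy to check empirically: at k=1 training accuracy is perfect (every point is its own nearest neighbour) while test accuracy lags, which is the overfitting signature. A minimal sketch on synthetic data (all names and numbers here are illustrative, not from the video):

    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split
    from sklearn.neighbors import KNeighborsClassifier

    # Synthetic binary-classification data, illustrative only
    X, y = make_classification(n_samples=500, n_features=5, random_state=42)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

    for k in (1, 15):
        knn = KNeighborsClassifier(n_neighbors=k).fit(X_train, y_train)
        # k=1 typically shows train accuracy 1.0 with a larger train/test gap
        print(k, knn.score(X_train, y_train), knn.score(X_test, y_test))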

  • @87040256
    @87040256 2 years ago +1

    Thank you so much! This is exactly what I needed.

  • @VC-dm7jp
    @VC-dm7jp 3 years ago +2

    Thank you so much for explaining the concept and code in such a friendly manner.

  • @sharmakartikeya
    @sharmakartikeya 3 years ago +5

    Thank you sir, KNN is pretty clear to me now !! : )

  • @Neerajkumar-xl9kx
    @Neerajkumar-xl9kx 3 years ago

    Great way of teaching, combining the code and the implementation.

  • @AdityaRaj-kl1be
    @AdityaRaj-kl1be 4 years ago +13

    In this video, you said the model will underfit when k=1, but the model actually tends to overfit when k is low;
    as we increase k, the model moves toward underfitting.

    • @saisai-yo4nv
      @saisai-yo4nv 4 years ago +2

      Yeah, I have the same doubt: at k=1 it will be overfitting and at k=n it will be underfitting.

    • @Yzyou11
      @Yzyou11 2 years ago

      Yes

  • @tagoreji2143
    @tagoreji2143 2 years ago

    This is what is needed, thank you so much sir.

  • @nileshkulkarni2845
    @nileshkulkarni2845 3 years ago

    Very well explained, sir. Thanks a lot for making the concept clear.

  • @sandipansarkar9211
    @sandipansarkar9211 4 years ago +7

    Superb explanation. Now I just need to get my hands dirty in the Jupyter notebook. Thanks.

  • @sairohithpasham
    @sairohithpasham 3 years ago

    Thanks for giving a lucid explanation.

  • @mallikharjunv6805
    @mallikharjunv6805 3 years ago

    Thanks Krish. Good explanation!

  • @studio2038
    @studio2038 3 years ago +1

    🙏 Nice video, easy to understand.

  • @gauravkumar2602
    @gauravkumar2602 3 years ago

    Amazingly explained. Thanks a lot.

  • @sandipansarkar9211
    @sandipansarkar9211 4 years ago

    Finished practicing in the Jupyter notebook. Thanks.

  • @manikaransingh3234
    @manikaransingh3234 4 years ago +4

    I don't understand the idea of using KNN for a regression problem. For classification it's fine: there you know the location of the point (its x and y values) and you have to predict its category, so picking the five nearest points is understandable.
    But in a regression problem you only know the x value of a point and have to predict the y value, if I'm not wrong. In the video, you first plot the point and then pick 5 or so nearest points. But if you already know the location (x, y) of the point, what is the problem here? The mean of the 5 neighbours gives you what? I'm guessing the y value, but if that is so, then how will you pick the k neighbours? (See the sketch after this thread.)
    Please answer!

    • @krishnaik06
      @krishnaik06 4 years ago

      For a KNN regressor you take the average of the 5 nearest neighbours.

    • @manikaransingh3234
      @manikaransingh3234 4 years ago

      @@krishnaik06 I'm really sorry sir. But that doesn't answer my question.
      I understand you're busy and maybe couldn't go through the whole question.
      Please try to look at it once more and reply whenever you have time.
      Thanks!

    • @sathishs1756
      @sathishs1756 4 years ago

      @@manikaransingh3234 I'm not sure exactly, but according to the lecture we should select the k value as 5, and the mean of the 5 nearest neighbours' values gives the y value.

    • @manikaransingh3234
      @manikaransingh3234 4 years ago

      @@sathishs1756 You didn't understand my question either.
      Okay,
      you say five neighbours; neighbours of which point?

    • @himabinduh7623
      @himabinduh7623 4 years ago +1

      @manikaranasingh Same doubt here
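
On the regression question above: the query point's features (x) are known, its target (y) is not; the k neighbours are found by distance in feature space alone, and the prediction is the mean of those neighbours' y values. A minimal sketch with made-up numbers:

    import numpy as np
    from sklearn.neighbors import KNeighborsRegressor

    # Tiny 1-D regression set: x is known for every point, y is the target
    X_train = np.array([[1.0], [2.0], [3.0], [4.0], [5.0], [6.0]])
    y_train = np.array([1.1, 1.9, 3.2, 3.9, 5.1, 6.0])

    model = KNeighborsRegressor(n_neighbors=3).fit(X_train, y_train)

    # For a new point we know only x = 3.6; neighbours are found by x-distance alone
    print(model.predict([[3.6]]))    # averages y at x = 3, 4, 5
    print(np.mean([3.2, 3.9, 5.1]))  # ~4.07, the same value by hand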

  • @karalworld
    @karalworld 5 years ago +1

    Excellent work. Done a good job.

  • @raghuram6382
    @raghuram6382 3 years ago +1

    @Krish Naik The dataset you explained here is a regression problem, right? Then why have you used "KNearestClassifier" in the code when importing from the sklearn library? Could you please tell me? Also, why is a classification report needed for a regression problem here?

  • @kvsaipratap7697
    @kvsaipratap7697 5 years ago +4

    Hi sir, thank you very much for sharing your knowledge. Can you please explain the concept of Weight of Evidence (WOE) and how it is used in classification algorithms?

  • @manjunath.c2944
    @manjunath.c2944 5 years ago +1

    Superb, good job. Very much appreciated.

  • @chetanmundhe8619
    @chetanmundhe8619 4 years ago

    Very nice explanation, thank you for this video.

  • @istech21
    @istech21 3 years ago +1

    You did not mention which metric is applied at test time. Euclidean? Manhattan? The sklearn library seems to use Minkowski by default.
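
That matches scikit-learn's documented default: metric='minkowski' with p=2, which is exactly Euclidean distance; p=1 gives Manhattan. A short sketch of the relevant parameters:

    from sklearn.neighbors import KNeighborsClassifier

    knn_default   = KNeighborsClassifier(n_neighbors=5)                      # minkowski, p=2 (Euclidean)
    knn_manhattan = KNeighborsClassifier(n_neighbors=5, p=1)                 # minkowski, p=1 (Manhattan)
    knn_explicit  = KNeighborsClassifier(n_neighbors=5, metric='euclidean')  # same as the default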

  • @kushalhu7189
    @kushalhu7189 3 years ago

    Perfectly explained

  • @chaitanyasrinevas8764
    @chaitanyasrinevas8764 3 years ago +2

    In the error rate vs value of K plot, shouldn't the value of K be around 37? At k=37 we are getting the least error. At this point the error is less than 0.6?
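
Rather than reading the best k off the plot by eye, it can be taken programmatically as the k with minimum error. A sketch with a toy stand-in for the error list built in the video:

    import numpy as np

    # Toy stand-in: error_rate[i] = mean(pred != y_test) for k = i + 1
    error_rate = np.array([0.12, 0.10, 0.09, 0.08, 0.085, 0.07, 0.075])

    best_k = int(np.argmin(error_rate)) + 1  # +1 because k starts at 1
    print(best_k)                            # 6, the k with the lowest error here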

  • @subbareddyjangalapalli4708
    @subbareddyjangalapalli4708 4 years ago +2

    Thank you Krish. Can we call all multiclass logistic regressions non-linear? Please confirm or post a small video. Thank you.

  • @shlokdoshi7162
    @shlokdoshi7162 2 years ago

    If I give an input list to the KNN algorithm to predict the class of each element, how can I print out only the inputs belonging to a particular class?
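
One way to do that is a boolean mask over the predictions. A minimal sketch with made-up data (all names here are illustrative):

    import numpy as np
    from sklearn.neighbors import KNeighborsClassifier

    X_train = np.array([[0], [1], [2], [8], [9], [10]])
    y_train = np.array([0, 0, 0, 1, 1, 1])
    X_new = np.array([[1.5], [9.5], [0.5], [8.5]])

    knn = KNeighborsClassifier(n_neighbors=3).fit(X_train, y_train)
    pred = knn.predict(X_new)

    # Keep only the inputs predicted as class 1
    print(X_new[pred == 1])  # [[9.5], [8.5]]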

  • @louerleseigneur4532
    @louerleseigneur4532 3 years ago

    Thanks Krish

  • @Subliminal001
    @Subliminal001 2 years ago +1

    I didn't get why you took k=23, as in the accuracy plot we can see that the accuracy is increasing after that point. We should take the k value so as to maximize the accuracy, right?

    • @KeigoEdits
      @KeigoEdits 2 years ago

      Same with me, did you get the point now?

  • @pavankumargopidesu4730
    @pavankumargopidesu4730 5 years ago +1

    Hi Krish, in what situations can we use KNN vs. logistic regression, and what is the difference between them?

  • @hasnainalibohra8232
    @hasnainalibohra8232 2 years ago +1

    Hello sir, for k=1 I'm getting overfitting, and as I increase the value of k the error rate is increasing. How do I choose the k value if the error graph is linear?

  • @RaviSharma-tg6yx
    @RaviSharma-tg6yx 3 years ago

    Is it also necessary to standardize the categorical variables in KNN to find a better k value?
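
Standardization applies to numeric features; categorical features are usually one-hot encoded instead of scaled. A sketch of one common arrangement, assuming a pandas frame with made-up column names:

    import pandas as pd
    from sklearn.compose import ColumnTransformer
    from sklearn.preprocessing import StandardScaler, OneHotEncoder
    from sklearn.pipeline import make_pipeline
    from sklearn.neighbors import KNeighborsClassifier

    # Made-up frame: one numeric and one categorical column
    df = pd.DataFrame({"age": [22, 35, 47, 52], "city": ["A", "B", "A", "C"]})
    y = [0, 1, 0, 1]

    pre = ColumnTransformer([
        ("num", StandardScaler(), ["age"]),  # scale numeric features
        ("cat", OneHotEncoder(), ["city"]),  # one-hot encode categorical ones
    ])
    model = make_pipeline(pre, KNeighborsClassifier(n_neighbors=3)).fit(df, y)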

  • @srinathakarur9798
    @srinathakarur9798 4 years ago

    Thank you sir for your logic.

  • @roopagaur8834
    @roopagaur8834 5 years ago

    Thank you so much! It's a really nice explanation.

  • @shreyanshdubey8530
    @shreyanshdubey8530 4 years ago +1

    Instead of StandardScaler, can't we use MinMaxScaler?
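
Either scaler works for KNN; what matters is that the features end up on comparable scales so that no single feature dominates the distance. A quick comparison of the two:

    import numpy as np
    from sklearn.preprocessing import StandardScaler, MinMaxScaler

    X = np.array([[1.0, 200.0], [2.0, 300.0], [3.0, 400.0]])

    print(StandardScaler().fit_transform(X))  # zero mean, unit variance per column
    print(MinMaxScaler().fit_transform(X))    # each column rescaled to [0, 1]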

  • @azharshaik21
    @azharshaik21 5 years ago +1

    Could you please briefly explain Euclidean and Manhattan distance?
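
Briefly: Euclidean distance is the straight-line distance (square root of the summed squared differences), while Manhattan distance sums the absolute differences along each axis. A two-point sketch:

    from scipy.spatial import distance

    p, q = (1, 2), (4, 6)

    print(distance.euclidean(p, q))  # sqrt(3^2 + 4^2) = 5.0
    print(distance.cityblock(p, q))  # |1-4| + |2-6| = 7 (Manhattan)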

  • @social_media789
    @social_media789 1 year ago

    How do you find the radius in KNN (in a Jupyter notebook, with code)?
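
Plain KNN uses a neighbour count rather than a radius, but scikit-learn does offer a radius-based variant; a minimal sketch with made-up data:

    import numpy as np
    from sklearn.neighbors import RadiusNeighborsClassifier

    X = np.array([[0], [1], [2], [8], [9], [10]])
    y = np.array([0, 0, 0, 1, 1, 1])

    # All training points within radius 2.0 of the query vote on the class
    model = RadiusNeighborsClassifier(radius=2.0).fit(X, y)
    print(model.predict([[1.2]]))  # neighbours 0, 1, 2 -> class 0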

  • @HARSHRAJ-2023
    @HARSHRAJ-2023 5 years ago

    Hi Krish. Can you please share the link to the video on imbalanced datasets?

  • @howdontanalytics6158
    @howdontanalytics6158 3 years ago

    Can you tell me how I can choose variables for KNN? I have 20+ variables and am not sure which variables to keep, or by what criteria.
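
One common starting point is univariate feature selection, keeping the features that score highest against the target; a sketch on synthetic data (the numbers are illustrative):

    from sklearn.datasets import make_classification
    from sklearn.feature_selection import SelectKBest, f_classif

    # 25 illustrative features; keep the 5 that score best by ANOVA F-test
    X, y = make_classification(n_samples=300, n_features=25, n_informative=5,
                               random_state=0)
    selector = SelectKBest(f_classif, k=5).fit(X, y)
    print(selector.get_support(indices=True))  # indices of the kept columns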

  • @SpiritedTravellerr
    @SpiritedTravellerr 3 years ago

    Sir, I have gone through the ML playlist and some videos are out of order after the 50th video, so can you please check it again? Some videos are interchanged.

  • @siddheshpawar1441
    @siddheshpawar1441 5 years ago +1

    Thank you sir,
    great explanation.
    Sir, can you make one video on the YOLO algorithm?

  • @helenhilamariam3149
    @helenhilamariam3149 4 years ago

    Hello sir, would you please explain the "Nearest Neighbour Algorithms
    for Forecasting Call Arrivals in Call Centers" article?

  • @aashishdagar3307
    @aashishdagar3307 3 years ago +2

    Hi Krish, why not use k=33 instead of 23? It has the minimum error and maximum accuracy.

    • @mahindrarao4565
      @mahindrarao4565 3 years ago

      It leads you to overfitting. Too little training error is also not acceptable.

    • @aashishdagar3307
      @aashishdagar3307 3 years ago

      I think we need to plot the error rate for train vs CV; then we have a better plot to look at for deciding between 23 and 33.
      If the gap between train and CV error is smaller at k=33 than at k=23, use k=33; otherwise k=23 is good.
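
For reference, the error-rate-vs-k curve discussed in this thread is built with a loop like the following; a reconstruction on synthetic data, not the video's exact dataset:

    import numpy as np
    import matplotlib.pyplot as plt
    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split
    from sklearn.neighbors import KNeighborsClassifier

    X, y = make_classification(n_samples=500, n_features=5, random_state=1)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)

    error_rate = []
    for k in range(1, 40):
        pred = KNeighborsClassifier(n_neighbors=k).fit(X_train, y_train).predict(X_test)
        error_rate.append(np.mean(pred != y_test))  # fraction misclassified

    plt.plot(range(1, 40), error_rate, marker='o')
    plt.xlabel('k')
    plt.ylabel('error rate')
    plt.show()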

  • @ashishgarg5186
    @ashishgarg5186 3 years ago

    How do outliers affect KNN?

  • @eugeneliu1212
    @eugeneliu1212 3 years ago

    If k is 4 and the neighbours split 2-2 between two classes, what would the classification be?

    • @codermafia3441
      @codermafia3441 2 years ago

      No, the k value should always be odd, so such ties don't occur.
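
To see the tie in action: with an even k a 2-2 split is possible; scikit-learn still returns a deterministic answer (it takes the mode of the neighbour labels), but choosing an odd k for binary problems avoids the ambiguity entirely. A toy check:

    import numpy as np
    from sklearn.neighbors import KNeighborsClassifier

    # A query at 0 sits exactly between two class-0 and two class-1 points
    X = np.array([[-2], [-1], [1], [2]])
    y = np.array([0, 0, 1, 1])

    knn = KNeighborsClassifier(n_neighbors=4).fit(X, y)
    print(knn.predict([[0]]))  # 2-2 tie, resolved deterministically by the library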

  • @Charmingenby
    @Charmingenby 4 years ago

    Hi Krish, did you compute error.rate as (1 - mean) because you standardised the data points beforehand? That part confuses me.

  • @ablearing4927
    @ablearing4927 4 years ago

    Hi Krish, I am trying to learn about algorithms which can be used for text-based analysis. Could you please advise?

  • @shreyanshsahay
    @shreyanshsahay 4 years ago

    Hi Krish, why didn't we take the square root of the number of data points to calculate k?
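
The rule of thumb the question refers to, k ≈ sqrt(n), is only a heuristic starting point, often rounded to an odd number; in the video k is instead chosen from the error-rate plot. As a sketch:

    import math

    n_train = 426                   # e.g. the number of training rows
    k = int(math.sqrt(n_train))     # rule-of-thumb starting value: 20
    k = k if k % 2 == 1 else k + 1  # make it odd to avoid ties
    print(k)                        # 21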

  • @unezkazi4349
    @unezkazi4349 3 years ago

    How does it train itself on the data?

  • @surendratadakaluru8900
    @surendratadakaluru8900 5 years ago

    Why do we take k=5?
    Can we take any other value or not?

  • @sindhumathi9209
    @sindhumathi9209 3 years ago

    I have started learning about data modelling and ML. My doubt is that K-Nearest Neighbours comes under classification algorithms, which are a type of supervised learning, but here it is explained with regression also. Can anyone help me understand?

    • @codermafia3441
      @codermafia3441 2 years ago

      It works for both classification and regression problems, and it comes under supervised machine learning. But in real-world data scenarios KNN is mostly used for classification problems.

  • @abhiyujaiswal7579
    @abhiyujaiswal7579 5 years ago

    Impressive! Nice clarification.

  • @pabitrakumarghorai7623
    @pabitrakumarghorai7623 4 years ago +2

    I do not understand why you take the number of nearest neighbours k as 23.
    Please reply, sir...

    • @lalitchaudhari7470
      @lalitchaudhari7470 4 years ago

      Refer to the following video, buddy:
      th-cam.com/video/otolSnbanQk/w-d-xo.html

    • @manikantamaka7910
      @manikantamaka7910 4 years ago

      We choose k by looking at the graph,
      with k on the x-axis and error rate on the y-axis. That's how k came out as 23.

  • @dragolov
    @dragolov 3 years ago

    These are 2 musical (jazz) solos generated using K Nearest Neighbor classifier:
    th-cam.com/video/zt3oZ1U5ADo/w-d-xo.html
    th-cam.com/video/Shetz_3KWks/w-d-xo.html

  • @xinyuanliu1959
    @xinyuanliu1959 4 years ago

    I don't understand why he chooses k=5 while later in the video he chooses 23?

    • @manikaransingh3234
      @manikaransingh3234 4 years ago

      Choosing k=5 is just for reference; it's just another example. You have to pick the best value of k, the one for which the final error is minimum. The value of k will basically depend on the dataset points.

  • @sajidurrehman89
    @sajidurrehman89 4 years ago +1

    Why do we need training if we just calculate distances from points at test time? What exactly is done in the training phase if we just classify points based on distance?

    • @adipurnomo5683
      @adipurnomo5683 3 years ago

      KNN does not have a training phase.

    • @QasimKhan-nd8og
      @QasimKhan-nd8og 3 years ago

      Internally, KNN uses a tree data structure to sort feature vectors so that it does not have to search the entire training set when finding nearest neighbours. This data structure is generated during training.
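
That is also exposed directly in scikit-learn: fit() can build a KD-tree (or ball tree) index so queries avoid scanning every training point. A small sketch using the KDTree class on made-up data:

    import numpy as np
    from sklearn.neighbors import KDTree

    rng = np.random.default_rng(0)
    X_train = rng.random((1000, 3))  # made-up training features

    tree = KDTree(X_train)           # the "training" step: building the index
    dist, idx = tree.query([[0.5, 0.5, 0.5]], k=5)  # 5 nearest, no full scan
    print(idx)                       # row indices of the neighbours

The same choice is available via KNeighborsClassifier(algorithm='kd_tree').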

  • @iftikhar3609
    @iftikhar3609 3 years ago

    Sir, do you have any Discord or Slack community? If yes, please share it here; I would like to join your community.

  • @kavyasharma5540
    @kavyasharma5540 4 years ago

    Where is the link to this Kaggle code?

  • @shubhamsahu943
    @shubhamsahu943 4 years ago

    Sir, what is the name of this dataset on Kaggle?

  • @anandprasadcc0967
    @anandprasadcc0967 4 years ago

    Thank you sir :)

  • @awesomeak7083
    @awesomeak7083 4 years ago

    Great

  • @madhabipatra8973
    @madhabipatra8973 3 years ago

    Please help me find ML Tutorial 44.

  • @yijunshen9287
    @yijunshen9287 3 years ago

    Best compared to other resources!

  • @shreeshanayak187
    @shreeshanayak187 5 months ago

    Can you please provide the PPT?

  • @arunkumarr6660
    @arunkumarr6660 5 years ago

    Could see the "Kadhal Vandhale" song in your bookmarks!!! Hah hah... nice song though.

  • @atifroome
    @atifroome 3 years ago

    Hue = hoie 😀

  • @sarthakbhatnagar961
    @sarthakbhatnagar961 4 years ago

    go corona corona go!!
