Machine Learning Tutorial Python - 18: K nearest neighbors classification with python code

  • Published on 27 Jun 2024
  • In this video we will understand how the K nearest neighbors algorithm works, then write Python code using the sklearn library to build a KNN (K nearest neighbors) model. At the end, I have an exercise for you to practice the concepts you learnt in this video.
    Code: github.com/codebasics/py/blob...
    Exercise: github.com/codebasics/py/blob...
    ⭐️ Timestamps ⭐️
    00:00 Theory
    03:51 Coding
    14:09 Exercise
    Do you want to learn technology from me? Check codebasics.io/?... for my affordable video courses.
    Machine learning tutorial playlist for beginners: • Machine Learning Tutor...
    🌎 My Website For Video Courses: codebasics.io/?...
    Need help building software or data analytics and AI solutions? My company www.atliq.com/ can help. Click on the Contact button on that website.
    🎥 Codebasics Hindi channel: / @codebasicshindi
    #️⃣ Social Media #️⃣
    🔗 Discord: / discord
    📸 Dhaval's Personal Instagram: / dhavalsays
    📸 Instagram: / codebasicshub
    🔊 Facebook: / codebasicshub
    📱 Twitter: / codebasicshub
    📝 Linkedin (Personal): / dhavalsays
    📝 Linkedin (Codebasics): / codebasics
    ❗❗ DISCLAIMER: All opinions expressed in this video are of my own and not that of my employers'.
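
To make the theory part concrete, here is a minimal from-scratch sketch of the K nearest neighbors idea: measure the distance from a query point to every training point, keep the k closest, and take a majority vote. The function name `knn_predict` and the toy data are made up for illustration; this is not the code from the video, which uses sklearn.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Euclidean distance from the query to every training point
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Keep the k closest neighbors
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    # Majority vote over the neighbors' labels
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy 2-D data (hypothetical): two loose clusters labelled "a" and "b"
points = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.0), (6.0, 6.0), (7.0, 7.5), (6.5, 5.5)]
labels = ["a", "a", "a", "b", "b", "b"]

prediction_near_a = knn_predict(points, labels, (1.8, 1.4), k=3)
prediction_near_b = knn_predict(points, labels, (6.2, 6.4), k=3)
print(prediction_near_a, prediction_near_b)
```

There is no training step beyond storing the data, which is why KNN is often described as a lazy algorithm.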

Comments • 90

  • @codebasics
    @codebasics  2 years ago +3

    Do you want to learn technology from me? Check codebasics.io/ for my affordable video courses.

  • @flavio4923
    @flavio4923 2 years ago +43

    Just a tip I read in a book: for highly structured data a smaller K is better (like this example, or handwriting/speech recognition), but for noisy data a bigger K is recommended.
    Keep up the great videos!

    • @adwaimohan3728
      @adwaimohan3728 1 month ago

      Can you let me know which book you're referring to?

  • @PythonArms
    @PythonArms 2 years ago +16

    When you said most important skill then said ctrl C/Ctrl V I lost it. haha. great video

  • @Koome777
    @Koome777 6 months ago +4

    I got a score of 0.99444 with k=6 while using random_state to test the outcome of each change in K. I've also discovered that sklearn can display a confusion matrix without calling seaborn or matplotlib directly. The class is called ConfusionMatrixDisplay and it only needs the confusion matrix as its parameter. Thanks Dhaval Patel sir.
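
A minimal sketch of the `ConfusionMatrixDisplay` usage mentioned above, assuming the digits dataset from the exercise and k=6. The `random_state` value is an arbitrary choice, so the numbers will not necessarily match the score quoted in the comment.

```python
from sklearn.datasets import load_digits
from sklearn.metrics import ConfusionMatrixDisplay, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

digits = load_digits()
X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target, test_size=0.2, random_state=0  # arbitrary seed
)

knn = KNeighborsClassifier(n_neighbors=6)
knn.fit(X_train, y_train)

# Rows are true digits, columns are predicted digits
cm = confusion_matrix(y_test, knn.predict(X_test))

# Wrap the matrix in a display object; display_labels is optional
display = ConfusionMatrixDisplay(confusion_matrix=cm, display_labels=knn.classes_)
# display.plot()  # uncomment to draw; scikit-learn uses matplotlib internally for this
```

Note that matplotlib still has to be installed for `display.plot()` to work; the convenience is only that you do not write the plotting code yourself.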

  • @gusinthecloud
    @gusinthecloud 2 years ago +3

    The best teacher makes simple the difficult subject. Thank you. You are great!!!

  • @zizoublbs8918
    @zizoublbs8918 2 years ago +8

    I've tested the classification with KNN on the digits dataset and got an accuracy of 99.44% with n_neighbors=3 and test_size=0.2 for the split (I never look at the solutions). Thank you for your tutorials, they're extremely useful (y)

  • @vimalradadiya5929
    @vimalradadiya5929 2 years ago +5

    Nice explanation and sir please continue this playlist it's really helpful to gain knowledge regarding Machine learning.
    Thank you so much sir

  • @user-bo2eg8dq9c
    @user-bo2eg8dq9c 2 years ago +7

    "How to become data scientist/ML engineer"
    Views = 500K
    "Tutorial on ML/DS topics"
    Views = 5-10K
    sums up how much effort everyone is putting to be a ML/DS guy🌝

    • @codebasics
      @codebasics  2 years ago

      Ha ha, this is so true 😊

  • @ghzich017
    @ghzich017 2 years ago +3

    7:39 Most relatable statement I've heard so far

  • @jiyabyju565
    @jiyabyju565 2 years ago

    these lectures tempt me to search for next upcoming videos...thank you for all these effort...

  • @FindingInsights
    @FindingInsights 2 years ago

    What an amazing and easy-to-learn video. Thank you.

  • @marcellodichiera
    @marcellodichiera 2 years ago +1

    Going back to basics Boss!?! You are amazing !!

  • @A0G7
    @A0G7 1 year ago

    The most important skill that you have is to copy the amazing knowledge you have and paste it smoothly into our understanding. That is what I call mastering Ctrl+C and Ctrl+V.

  • @tianhuicao3297
    @tianhuicao3297 1 year ago

    Thank you so much, Dhaval! I'm watching your videos to survive my DS classes.

  • @alcryton6515
    @alcryton6515 2 years ago +3

    Even my training institute is teaching from your codes only.
    You are really a great teacher

  • @60pluscrazy
    @60pluscrazy 2 years ago +5

    Really an outlier in simplification. Awesome 🙏

  • @pouriab9782
    @pouriab9782 2 years ago +1

    I tested n_neighbors from 1 to 100 using cross validation and plotted the results. It looks like the score declines as you increase the number of neighbors (the cool thing is it's linear).
    The highest score was 0.9915, at n_neighbors = 3 with test_size = 0.33.
    P.S.: I have watched all the videos from this series and I've got to say you're amazing, sir. Keep making tutorials because you're the best!

    • @zizoublbs8918
      @zizoublbs8918 2 years ago +1

      Try test_size=0.2, I got 0.9944 😀
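
The sweep over n_neighbors described in this thread can be sketched with `cross_val_score`. This assumes the digits dataset and 5-fold cross validation; the range is shortened to 1 through 20 to keep it quick, and the exact scores depend on the folds, so they may differ from the figures quoted above. Note that `cross_val_score` does not use `test_size`; the fold count `cv` decides how the data is divided.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = load_digits(return_X_y=True)

# Mean 5-fold cross-validated accuracy for each candidate k
k_values = list(range(1, 21))
mean_scores = {
    k: cross_val_score(KNeighborsClassifier(n_neighbors=k), X, y, cv=5).mean()
    for k in k_values
}

# Pick the k with the highest mean accuracy
best_k = max(mean_scores, key=mean_scores.get)
print(f"best k = {best_k}, mean accuracy = {mean_scores[best_k]:.4f}")
```

Plotting `mean_scores` against `k_values` gives the kind of curve the commenters describe.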

  • @mbogitechconpts
    @mbogitechconpts 2 years ago

    First time here and I just have to subscribe. Very funny but good teacher. God bless you.

  • @user-sy4kk3oh1o
    @user-sy4kk3oh1o 1 month ago

    Very easy and meaningful explanation, Thank You Sir

  • @amandaahringer7466
    @amandaahringer7466 2 years ago

    Awesome explanation!!

  • @nazmulhaqueomi8121
    @nazmulhaqueomi8121 7 months ago

    Your last comment is very sir.
    You are an amazing teacher sir.
    Thanks a lot

  • @shylashreedev2685
    @shylashreedev2685 2 years ago +3

    Exercise result: 99.16% score with k=4. Thank you so much sir, I'm trying to solve all your exercises and it is helping me build my confidence in ML.

  • @charmilam920
    @charmilam920 2 years ago +2

    Amazing
    Please do it for other algorithms also

  • @alidakhil3554
    @alidakhil3554 2 years ago

    You are excellent!!!!

  • @VishnuPriya-tz4ls
    @VishnuPriya-tz4ls 1 month ago

    You're the best. Thank you

  • @datastako156
    @datastako156 2 years ago +4

    I'm gonna use my most important skill "Ctrl-C Ctrl-V" hahaha.. that's funny sir

  • @sophiamary2522
    @sophiamary2522 2 years ago

    Very useful video, thanks a lot

  • @paulkornreich9806
    @paulkornreich9806 2 years ago +1

    Plotted scores for k between 1 and 50 and found the highest accuracy, 99.72%, at k = 7 or k = 8. As others found, it sharply declined after that. Used the same parameters for the data split as in the video.

  • @anjalinair1763
    @anjalinair1763 11 months ago

    Really informative sir

  • @ramandeepbains862
    @ramandeepbains862 1 year ago

    k=1 has an overfitting issue; with k=3 I got the best score of 0.985397 on the exercise's digits dataset. Compared to SVM, KNN gave the better accuracy on the digits dataset.

  • @samrozch8419
    @samrozch8419 9 months ago

    nice lecture sir

  • @bestcomedyjokes4913
    @bestcomedyjokes4913 1 year ago

    pretty good tutorial for free👍

  • @saravanashanmuganathan4692
    @saravanashanmuganathan4692 2 years ago

    Thank u sir

  • @slainiae
    @slainiae 3 months ago

    Perfect score with n_neighbors = 6.

  • @SimranUppal6991
    @SimranUppal6991 7 months ago

    Your computer will start sneezing and it will have a fever 🤣🤣 Amazing content ❤💯

  • @carolinemoraes3704
    @carolinemoraes3704 11 months ago

    Thankss!

  • @narendraparmar1631
    @narendraparmar1631 7 months ago

    Thanks😀

  • @shubhamchauhan6916
    @shubhamchauhan6916 2 years ago

    Sir, can you explain the code to replace NaN values using KNN when the dataset has both categorical and continuous data points?
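
One possible starting point for the question above is scikit-learn's `KNNImputer`, sketched here on a small made-up numeric table. It works on numeric arrays only, so categorical columns would first need to be encoded as numbers; that encoding step is an assumption left out of this sketch and is not covered in the video.

```python
import numpy as np
from sklearn.impute import KNNImputer

# Small hypothetical numeric table with missing entries marked as NaN
X = np.array([
    [1.0, 2.0, np.nan],
    [3.0, 4.0, 3.0],
    [np.nan, 6.0, 5.0],
    [8.0, 8.0, 7.0],
])

# Each NaN is replaced by the mean of that feature over the 2 nearest rows
# that have a value for it
imputer = KNNImputer(n_neighbors=2)
X_filled = imputer.fit_transform(X)
print(X_filled)
```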

  • @shlokdoshi7162
    @shlokdoshi7162 2 years ago

    If I give an input list to the KNN algorithm to predict the class of each element, how can I print only the inputs that belong to a particular class?
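
One way to do what this question asks is a boolean mask over the predictions. The sketch below uses made-up 1-D data, and the variable names (`X_new`, `target_class`) are hypothetical.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Hypothetical 1-D training data: small values are class 0, large values are class 1
X_train = np.array([[1.0], [2.0], [3.0], [10.0], [11.0], [12.0]])
y_train = np.array([0, 0, 0, 1, 1, 1])

knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X_train, y_train)

# New inputs to classify
X_new = np.array([[1.5], [10.5], [2.5], [11.5]])
predictions = knn.predict(X_new)

# The boolean mask keeps only the inputs predicted as the class of interest
target_class = 1
inputs_in_target_class = X_new[predictions == target_class]
print(inputs_in_target_class)
```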

  • @yamrajoli3834
    @yamrajoli3834 2 years ago

    Hello sir, when making the heatmap with seaborn, I think the x-axis label should be Truth and the y-axis label Predicted, but you did exactly the opposite.
    Is it correct as you did it, or is there a mistake? I am confused about interpreting the confusion matrix.
    Please reply, sir.

  • @rajatsharma7899
    @rajatsharma7899 2 years ago +2

    Sir, I have a background in geophysics and I want to do data science in Canada. Is it possible to connect data science with geophysics?

  • @ArulPasupathi
    @ArulPasupathi 1 month ago

    Most important skill: Ctrl+C and Ctrl+V, hence proved in this video... 😝

  • @sai_sh
    @sai_sh 1 year ago

    Can we use SVM here (6:45), since the classes can easily be separated using a hyperplane?

  • @bhaskarg8438
    @bhaskarg8438 2 years ago +2

    I have one doubt: the call is confusion_matrix(truth, predicted), but in the plt graph we label the axes in reverse, with Predicted on the x-axis and Truth on the y-axis.
    Can you please clarify? Thank you 🙏

    • @gouravsapra8668
      @gouravsapra8668 2 years ago

      I don't think it matters... you can do it either way. You may try it yourself.
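
A tiny check of the orientation discussed in this thread: in scikit-learn, `confusion_matrix(y_true, y_pred)` puts true labels on the rows and predicted labels on the columns. The three-sample data is made up for illustration.

```python
from sklearn.metrics import confusion_matrix

y_true = [0, 0, 1]
y_pred = [0, 1, 1]

# Signature is confusion_matrix(y_true, y_pred):
# rows are true labels, columns are predicted labels
cm = confusion_matrix(y_true, y_pred)

# cm[0, 1] counts samples whose true label is 0 but were predicted as 1
true0_pred1 = cm[0, 1]
# cm[1, 0] counts samples whose true label is 1 but were predicted as 0
true1_pred0 = cm[1, 0]
print(cm)
```

When such a matrix is drawn as a heatmap, rows run along the y-axis and columns along the x-axis, so labelling y as Truth and x as Predicted matches the argument order. Swapping the two arguments transposes the matrix, so the labels would then need to be swapped as well.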

  • @ogochukwustanleyikegbo2420
    @ogochukwustanleyikegbo2420 10 months ago

    I got an accuracy of 96.38 with K = 4 after working on the exercise

  • @pheiroijamprishika6414
    @pheiroijamprishika6414 2 years ago

    Sir, can you post about unsupervised learning, about the Boltzmann machine and its types:
    i. Restricted Boltzmann machine
    ii. Deep Boltzmann machine
    iii. Deep belief network

  • @prathampandey9898
    @prathampandey9898 2 years ago

    Can you create videos on Reinforcement Learning?

  • @shifaabid1425
    @shifaabid1425 2 years ago +1

    most Important skill
    Ctrl + C and Ctrl + V....

  • @haleykwok2501
    @haleykwok2501 2 years ago

    😂sir you are humorous!

  • @vishnuviswanathmm
    @vishnuviswanathmm 2 years ago +1

    Is it necessary to split the dataset into training and testing sets for KNN, since KNN is a lazy algorithm?

    • @ShawnDypxz
      @ShawnDypxz 2 years ago +2

      He is just doing it for testing the model. So it's like he only had data equal to training data. Then he used testing data as foreign data to figure out where those foreign data lie in the clusters.

    • @vishnuviswanathmm
      @vishnuviswanathmm 2 years ago

      @ShawnDypxz Makes sense. Thank you

  • @eng.mariamalhussainy687
    @eng.mariamalhussainy687 2 years ago

    Hi Sir, thanks a lot for this explanation, you explain things really well. Excuse me Sir, is the split test = 30% and train = 70%?

    • @uchenwodo4603
      @uchenwodo4603 2 years ago

      yes it is.....but you can change it to 80:20 if you wish to
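
A short sketch of the two splits mentioned in this thread, assuming the digits dataset and scikit-learn's `train_test_split`. The `random_state` value is arbitrary and only there to make the split repeatable.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split

X, y = load_digits(return_X_y=True)

# 70:30 split: 30% of the rows are held out for testing
X_train_70, X_test_30, y_train_70, y_test_30 = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# 80:20 split: 20% of the rows are held out for testing
X_train_80, X_test_20, y_train_80, y_test_20 = train_test_split(
    X, y, test_size=0.2, random_state=0
)

print(len(X_train_70), len(X_test_30))
print(len(X_train_80), len(X_test_20))
```

The held-out rows play the role of unseen data, which is why a split is still useful for evaluating KNN even though the algorithm has no real training phase.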

  • @mikettu
    @mikettu 2 years ago

    Great explanation!!! Using GridSearchCV, K=2 was the best value. Continue the great job
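
A minimal sketch of searching for K with `GridSearchCV`, assuming the digits dataset, 5-fold cross validation, and candidate values 1 through 10. The grid and fold count are illustrative choices, and the best value found here may differ from the K=2 reported in the comment.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier

X, y = load_digits(return_X_y=True)

# Candidate values for n_neighbors (an illustrative range)
param_grid = {"n_neighbors": list(range(1, 11))}

# Fit one model per candidate per fold and keep the best mean score
search = GridSearchCV(KNeighborsClassifier(), param_grid, cv=5)
search.fit(X, y)

best_k = search.best_params_["n_neighbors"]
print(f"best k = {best_k}, mean CV accuracy = {search.best_score_:.4f}")
```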

  • @talharauf3111
    @talharauf3111 2 years ago

    🤩

  • @Survivor-xs9gv
    @Survivor-xs9gv 2 years ago

    I was following this tutorial series but unfortunately it is missing some topics. Any idea how many topics are left?

    • @codebasics
      @codebasics  2 years ago +4

      Yes, I need to cover XGBoost, AdaBoost, bagging, boosting, PCA. But the majority of the topics are already covered in the series.

  • @EK-wq7qt
    @EK-wq7qt 2 years ago

    *Hi, I have already added this comment to the Personal Finance video but am adding it again to get an answer*
    *NEED HELP*
    I am currently doing the *Personal Finance* project but I am getting an error, and despite searching a lot I could not resolve it. While transforming the data initially, when I change the column type from text to Date, every date ends in 2021, like 1/18/2021, even though the data was from 2019 (for example, jan-19). Please help me with this, I have been stuck here for a long time.

  • @praba8478
    @praba8478 2 years ago +1

    Sir, is it mandatory to learn Excel and statistics for data analyst jobs?

  • @praveendeena1493
    @praveendeena1493 2 years ago

    Hi sir ,I want your videos on Rasa chatbot and sentiment analysis

    • @shobhitbishop
      @shobhitbishop 2 years ago

      I can help you, worked on building RASA based models as well

    • @praveendeena1493
      @praveendeena1493 2 years ago

      @shobhitbishop thank you sir.

  • @orekisato9145
    @orekisato9145 7 months ago

    Please tell me step by step how to do KNN-based error checking on language, or please share a link?

  • @dataoverview7388
    @dataoverview7388 2 years ago

    Hi, how many ways are there to find the K value? And is the "Elbow" method useful for finding K or not? If not, why?

  • @nihalwaghmare1337
    @nihalwaghmare1337 2 years ago +1

    Sir, I'm a fresher chemical engineer. Can I build a career in data analytics? I'm so confused. Please help

    • @suprithhemanthkumar9166
      @suprithhemanthkumar9166 2 years ago

      yes you can

    • @MrKhan-gb3rc
      @MrKhan-gb3rc 7 months ago

      @suprithhemanthkumar9166 How to develop the model using KNN?

  • @Koalasq119
    @Koalasq119 1 year ago

    Sir, I would die for you. I am paying thousands of dollars for some guy to not teach me a goddamn thing.

  • @tinkhanimphasa91
    @tinkhanimphasa91 1 year ago +1

    😂😂😂 I like the joke at the end of the video. I have tried the exercise, and below is my classification report:
                  precision    recall  f1-score   support
               0       1.00      1.00      1.00        26
               1       0.89      1.00      0.94        50
               2       1.00      1.00      1.00        38
               3       1.00      0.93      0.96        28
               4       1.00      0.96      0.98        28
               5       0.96      1.00      0.98        43
               6       1.00      1.00      1.00        32
               7       0.95      1.00      0.98        42
               8       1.00      0.88      0.93        40
               9       1.00      0.94      0.97        33
        accuracy                           0.97       360
       macro avg       0.98      0.97      0.97       360
    weighted avg       0.97      0.97      0.97       360
    and my cm is:
    array([[26,  0,  0,  0,  0,  0,  0,  0,  0,  0],
           [ 0, 50,  0,  0,  0,  0,  0,  0,  0,  0],
           [ 0,  0, 38,  0,  0,  0,  0,  0,  0,  0],
           [ 0,  0,  0, 26,  0,  1,  0,  1,  0,  0],
           [ 0,  0,  0,  0, 27,  0,  0,  1,  0,  0],
           [ 0,  0,  0,  0,  0, 43,  0,  0,  0,  0],
           [ 0,  0,  0,  0,  0,  0, 32,  0,  0,  0],
           [ 0,  0,  0,  0,  0,  0,  0, 42,  0,  0],
           [ 0,  5,  0,  0,  0,  0,  0,  0, 35,  0],
           [ 0,  1,  0,  0,  0,  1,  0,  0,  0, 31]], dtype=int64)
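
The report and the matrix in this comment are two views of the same numbers. As a sketch, the per-class precision and recall can be recomputed from the posted confusion matrix with NumPy, taking rows as true digits and columns as predicted digits (scikit-learn's convention).

```python
import numpy as np

# Confusion matrix copied from the comment above
cm = np.array([
    [26, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 50, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 38, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 26, 0, 1, 0, 1, 0, 0],
    [0, 0, 0, 0, 27, 0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0, 43, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 32, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 42, 0, 0],
    [0, 5, 0, 0, 0, 0, 0, 0, 35, 0],
    [0, 1, 0, 0, 0, 1, 0, 0, 0, 31],
])

true_positives = np.diag(cm)
# Column sums: how many samples were predicted as each digit
precision = true_positives / cm.sum(axis=0)
# Row sums: how many samples truly are each digit (the "support" column)
recall = true_positives / cm.sum(axis=1)
accuracy = true_positives.sum() / cm.sum()

print(np.round(precision, 2))
print(np.round(recall, 2))
print(round(accuracy, 2))
```

For example, digit 8 has 35 correct predictions out of 40 true samples, giving the recall of about 0.88 in the report, and digit 1 was predicted 56 times with 50 correct, giving the precision of about 0.89.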

  • @themoneymaker03
    @themoneymaker03 2 years ago

    Dang I caught the virus lol j/k. Thanks great video! 👍

  • @JMS-ht3td
    @JMS-ht3td 1 year ago

    now my computer has a fever

  • @markvincentgallemit9894
    @markvincentgallemit9894 1 year ago

    test_size = 0.2
    k = 7
    score: 0.9972222222222222

  • @brandonsager223
    @brandonsager223 2 years ago

    My computer got the virus, he wasn't lying

  • @rayhansuryatama909
    @rayhansuryatama909 2 years ago

    ayo why is marc specter teaching artificial intelligence?

    • @codebasics
      @codebasics  2 years ago

      It is my alter ego teaching ML 😎

  • @elahehgorgin9769
    @elahehgorgin9769 2 years ago

    Why people keep calling you sir?

    • @zainnaveed267
      @zainnaveed267 2 years ago

      How is this question even related to ML?

  • @curiousMan69
    @curiousMan69 3 months ago

    lol virus 😂

  • @princedoshi1594
    @princedoshi1594 1 year ago

    i think sir you need to work on how to speak. It seems you are pretty confused yourself