Machine Learning Tutorial Python - 6: Dummy Variables & One Hot Encoding

  • Published on Jan 4, 2025

Comments • 678

  • @codebasics
    @codebasics 2 years ago +13

    Check out our premium machine learning course with 2 Industry projects: codebasics.io/courses/machine-learning-for-data-science-beginners-to-advanced

  • @celestineokpataku
    @celestineokpataku 4 years ago +56

    I have watched only 4 mins so far and I had to pause and write this comment. I will say this is one of the best tutorials I have seen in data science. Sir, you need to take this to another level. What a great teacher you are.

    • @codebasics
      @codebasics 4 years ago +5

      Thanks for the feedback, my friend 😊👍

    • @chitz7435
      @chitz7435 3 months ago +1

      100% aligned... I am doing an external course but have to refer to your session to understand the topic in that course... amazing effort.

  • @TheSignatureGuy
    @TheSignatureGuy 4 years ago +60

    For anyone stuck with the categorical_features error:
    from sklearn.compose import ColumnTransformer
    from sklearn.preprocessing import OneHotEncoder
    # One-hot encode column 0 (town) and pass the remaining columns through unchanged
    ct = ColumnTransformer([("town", OneHotEncoder(), [0])], remainder='passthrough')
    X = ct.fit_transform(X)
    X
    Then you should be able to continue the tutorial without further issue.

    • @muhammadhattahakimkeren
      @muhammadhattahakimkeren 4 years ago +1

      thanks bro

    • @fatimahazzahra6181
      @fatimahazzahra6181 4 years ago

      thanks a lot! it helps

    • @souvikdas3189
      @souvikdas3189 1 year ago +1

      Thank you brother.

    • @Ran_dommmm
      @Ran_dommmm 1 year ago +1

      Hey, thanks for the code.
      I tried using your code, but even after converting X to an array it gives me this error:
      "TypeError: A sparse matrix was passed, but dense data is required. Use X.toarray() to convert to a dense numpy array."

    • @TheSignatureGuy
      @TheSignatureGuy 1 year ago

      @@Ran_dommmm I know you said "despite converting X to an array", but just double-check that you have used the .toarray() method correctly. The error message seems pretty clear on this one.
      This function may help confirm that a dense numpy array is being passed.
      import numpy as np
      def is_dense(matrix):
          # True only for a dense NumPy array; a scipy sparse matrix returns False
          return isinstance(matrix, np.ndarray)
      Pass in X for matrix and it should return True.
      Good luck fixing this.
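The sparse-matrix error discussed in this thread can be illustrated with a minimal sketch. It assumes a recent scikit-learn and uses made-up toy data (the `X_raw` values are hypothetical, not the tutorial's CSV). `sparse_threshold=0` asks `ColumnTransformer` to always return a dense array, and the `issparse` check is only a safety net:

```python
import numpy as np
import scipy.sparse
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder

# Hypothetical toy data: column 0 is a town name, column 1 is an area.
X_raw = np.array(
    [["monroe", 2600], ["west windsor", 3000], ["robinsville", 2900]],
    dtype=object,
)

# sparse_threshold=0 tells ColumnTransformer to always stack a dense result.
ct = ColumnTransformer(
    [("town", OneHotEncoder(), [0])],
    remainder="passthrough",
    sparse_threshold=0,
)
X = ct.fit_transform(X_raw)

# Safety net: convert explicitly if a sparse matrix still comes back.
if scipy.sparse.issparse(X):
    X = X.toarray()

# Make every column numeric so an estimator can consume it.
X = np.asarray(X, dtype=float)
print(type(X), X.shape)
```

Whether the default settings return a sparse or dense result depends on the density of the stacked output, so checking the type of `X` before fitting is a cheap way to see which case applies.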

  • @codebasics
    @codebasics 5 years ago +15

    Exercise solution: github.com/codebasics/py/blob/master/ML/5_one_hot_encoding/Exercise/exercise_one_hot_encoding.ipynb
    Everyone, the error with categorical_features is fixed. Check the new notebook on my GitHub (link in video description). Thanks to Kush Verma for sending me a pull request with the fix.

    • @urveshdave1861
      @urveshdave1861 5 years ago

      Thank you for the wonderful explanation, sir. However, I am getting the error "__init__() got an unexpected keyword argument 'catergorical_features'" for this line of my code: onehotencoder = OneHotEncoder(catergorical_features = [0]). Is it because of a change of versions?
      What is the solution to this?

    • @bishwarupdey10
      @bishwarupdey10 4 years ago

      __init__() got an unexpected keyword argument 'categorical_features'. Sir, I get this error when I specify categorical features.

    • @sejalmittal1326
      @sejalmittal1326 4 years ago

      @@urveshdave1861 Have you got any answer for this? I am having the same error

    • @sejalmittal1326
      @sejalmittal1326 4 years ago

      @@urveshdave1861 okay .. i will do that. thanks

    • @tanvisingh9298
      @tanvisingh9298 4 years ago

      @@urveshdave1861 Hey I am also getting the same error. how did you resolve it?

  • @venkatesanrf
    @venkatesanrf 4 years ago +21

    Hi,
    Your explanation is very simple and effective.
    Answers for the practice session:
    A) Price of a Mercedes Benz, 4 years old, mileage 45000 = 36991.31721061
    B) Price of a BMW X5, 7 years old, mileage 86000 = 11080.74313219
    C) Accuracy = 0.9417050937281082 (94 percent)

    • @ANIMESH_JAIN04
      @ANIMESH_JAIN04 7 months ago

      Same bro

    • @fathoniam8997
      @fathoniam8997 6 months ago

      same bro.... thx for replying so that i can check my results

  • @jhagaurav8292
    @jhagaurav8292 6 years ago +115

    Sir, please continue your machine learning tutorials; your tutorials are among the best I have seen so far.

    • @codebasics
      @codebasics 5 years ago +23

      Sure Gaurav, I just started a deep learning series. Check it out.

    • @samrahafeez5001
      @samrahafeez5001 3 years ago +3

      @@codebasics
      Kindly explain the concept of dummies in deep learning as well

  • @sreenufriendz
    @sreenufriendz 5 years ago +5

    Anyone can be a teacher, but a real teacher eliminates the fear from students... you did the same! Excellent knowledge and skills.

    • @codebasics
      @codebasics 5 years ago

      Sreenivasulu, your comment means a lot to me, thanks 😊

  • @tagonniruha4856
    @tagonniruha4856 4 years ago

    How do I download these attached files from GitHub?
    Code in tutorial: github.com/codebasics/py/tree...
    Exercise csv file: github.com/codebasics/py/blob...

    • @codebasics
      @codebasics 4 years ago

      Check video description and find the paragraph that starts with *to download CSV and code...* That section explains how to download those files

    • @tagonniruha4856
      @tagonniruha4856 4 years ago

      @@codebasics thanks

  • @programmingwithraahim
    @programmingwithraahim 3 years ago +49

    15:50 write your code like this:
    from sklearn.compose import ColumnTransformer
    from sklearn.preprocessing import OneHotEncoder
    ct = ColumnTransformer(
        [('one_hot_encoder', OneHotEncoder(categories='auto'), [0])],
        remainder='passthrough'
    )
    X = ct.fit_transform(X)
    X
    This works fine; otherwise it will give an error.

    • @AxelWolf26
      @AxelWolf26 3 years ago +1

      What is the use of (categories='auto') and 'one_hot_encoder'?

    • @jollycolours
      @jollycolours 2 years ago +1

      Thank you, you're a lifesaver! I was trying multiple ways since categorical_features has now been deprecated.

    • @adilmajeed8439
      @adilmajeed8439 2 years ago +8

      @@jollycolours Correct, the categorical_features parameter is deprecated; these are the steps to follow instead:
      import numpy as np
      from sklearn.compose import ColumnTransformer
      from sklearn.preprocessing import OneHotEncoder
      ct = ColumnTransformer([('one_hot_encoder', OneHotEncoder(), [0])], remainder='passthrough')
      X = np.array(ct.fit_transform(X), dtype=float)
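On the question above about `'one_hot_encoder'` and `categories='auto'`, a minimal sketch may help. It uses hypothetical toy data and assumes a recent scikit-learn: the string is just a label for looking the step up later, and `categories='auto'` lets the encoder infer the categories from the data during fitting.

```python
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder

# Hypothetical toy data: column 0 is a town name, column 1 is an area.
X_raw = np.array(
    [["monroe township", 2600], ["west windsor", 3000], ["robinsville", 2900]],
    dtype=object,
)

ct = ColumnTransformer(
    # 'one_hot_encoder' is an arbitrary name chosen by the user.
    [("one_hot_encoder", OneHotEncoder(categories="auto"), [0])],
    remainder="passthrough",
)
ct.fit(X_raw)

# The name is the key for retrieving the fitted encoder afterwards.
encoder = ct.named_transformers_["one_hot_encoder"]

# categories='auto' means these were learned from the data, in sorted order.
town_columns = list(encoder.categories_[0])
print(town_columns)
```

The sorted order of `town_columns` is also the order of the dummy columns in the transformed array, which matters when typing the 0/1 values for `model.predict` by hand.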

  • @noubaddi8567
    @noubaddi8567 4 years ago +3

    This guy is AMAZING! I have spent 2 days trying dozens of other methods, and this is the only one that worked for my data and didn't come back with an error. This guy totally saved my sanity; I was growing desperate, as in DESPERATE! Thank you, thank you, thank you!

    • @codebasics
      @codebasics 4 years ago +1

      I am glad it was helpful to you 🙂👍

  • @vaishalibisht518
    @vaishalibisht518 6 years ago +11

    Wonderful video.
    This is by far the easiest explanation I have seen for one hot encoding. I have been struggling for a very long time to find a proper video on this topic, and my quest ended today.
    Thanks a lot, sir.

  • @Genz111-o4r
    @Genz111-o4r 4 years ago +25

    I was confused about where to start studying ML, and then my friend suggested this series... It's great :-)

    • @rishabhjain7572
      @rishabhjain7572 4 years ago

      any other courses or source you are following? and any development you have begun ?

    • @sauravmaurya6097
      @sauravmaurya6097 2 years ago

      want to know how much this playlist is helpful? kindly reply.

    • @carti8778
      @carti8778 2 years ago

      @@sauravmaurya6097 its quite helpful if u are a beginner. Beginner in sense of {not from engineering or programming background }. U can accompany this with coursera’s andrew ng course.

    • @carti8778
      @carti8778 2 years ago +1

      @@sauravmaurya6097 if u already know calculus and python programming (intermediate level) , ML would feel easy . After doing this go to the deep learning series bcz thats what used in industries.

  • @shrutijain1628
    @shrutijain1628 4 years ago +5

    This ML tutorial is by far the best one I have seen. It is so easy to learn and understand, and your exercises also help me apply what I have learned so far. Thank you.

  • @tech-n-data
    @tech-n-data 2 years ago +4

    Your ability to simplify things is amazing, thank you so much. You are a natural teacher.

  • @ymoniem1
    @ymoniem1 4 years ago +2

    You really made it very easy to understand such new concepts, thanks a lot.
    Starting from minute 12:30 about OneHotEncoder: some updates in sklearn prevent using categorical_features=[0].
    Here is the code update as of April 2020:
    import numpy as np
    from sklearn.preprocessing import OneHotEncoder
    from sklearn.compose import ColumnTransformer
    columnTransformer = ColumnTransformer([('encoder', OneHotEncoder(), [0])], remainder='passthrough')
    # dtype=float keeps the features numeric (the np.str alias was removed in NumPy 1.24)
    X = np.array(columnTransformer.fit_transform(X), dtype=float)
    X = X[:, 1:]  # drop the first dummy column to avoid the dummy variable trap
    model.fit(X, y)
    model.predict([[1, 0, 2800]])
    model.predict([[0, 1, 3400]])

    • @petermungai5508
      @petermungai5508 4 years ago

      The code is working but give a different prediction compared to dummies

    • @petermungai5508
      @petermungai5508 4 years ago

      Plus my X is showing 5 columns instead of 4

    • @petermungai5508
      @petermungai5508 4 years ago

      I was entering the 0 and 1 wrongly. I am getting the same answer thank you for the code

    • @rameshkrishna1956
      @rameshkrishna1956 10 months ago

      thanks buddy

  • @hiver6411
    @hiver6411 3 years ago +1

    the god of data science......Amazing explanation sir..kudos to your patience in explanation

    • @codebasics
      @codebasics 3 years ago +1

      Glad it was helpful!

  • @tushargahtori1570
    @tushargahtori1570 2 years ago

    Even in 23 your video is such a relief..kudos to your teaching.

  • @HashimAli-tz8fw
    @HashimAli-tz8fw 1 year ago +4

    I achieved the same result using a different method that doesn't require dropping columns or concatenating dataframes. This alternative approach can lead to cleaner and more efficient code:
    df = pd.get_dummies(df, columns=['CarModel'], drop_first=True)
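The one-liner above can be tried on a small frame. This is only a sketch: the column name `CarModel` follows the comment, while the rows are made-up values rather than the exercise CSV, and `dtype=int` is added so the output shows 0/1 on any pandas version.

```python
import pandas as pd

# Hypothetical rows; only the column name 'CarModel' comes from the comment above.
df = pd.DataFrame({
    "CarModel": ["BMW X5", "Audi A5", "Mercedes C", "BMW X5"],
    "Mileage": [69000, 59000, 46000, 87600],
})

# columns= encodes in place of the original column, and drop_first=True
# removes the first (alphabetically smallest) category to avoid redundancy.
encoded = pd.get_dummies(df, columns=["CarModel"], drop_first=True, dtype=int)
print(encoded)
```

The dropped category ("Audi A5" here) becomes the baseline: a row with zeros in every remaining dummy column belongs to it.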

  • @bandhammanikanta1664
    @bandhammanikanta1664 5 years ago +3

    First of all, 1000*Thanks for sharing such content on youtube..
    I got an accuracy of 94.17% on training data.

    • @codebasics
      @codebasics 5 years ago

      Bandham, I am glad you liked it buddy 👍

  • @mk9834
    @mk9834 4 years ago +3

    I was shocked after the first 5 minutes of the video; I never thought it would be so easy and fast! Thanks a lot!

    • @codebasics
      @codebasics 4 years ago +1

      Miyuki... I am glad you liked it

  • @shadabtechno
    @shadabtechno 11 months ago

    You are the best teacher on YouTube; I have never seen one like you before.

  • @ankitparashar7
    @ankitparashar7 5 years ago +52

    Merc: 36991.317
    BMW: 11080.743
    Score: 94.17%

    • @codebasics
      @codebasics 5 years ago +7

      Your answer is perfect Ankit. Good job, here is my answer sheet for comparison: github.com/codebasics/py/blob/master/ML/5_one_hot_encoding/Exercise/exercise_one_hot_encoding.ipynb

    • @vishalrai2859
      @vishalrai2859 4 years ago +2

      thanks for posting the answer bro

    • @mutiulmuhaimin9156
      @mutiulmuhaimin9156 4 years ago +2

      Could we upvote this comment to the top? Been looking for this for quite some time now. This is important, and this comment matters.

    • @Augustus1003
      @Augustus1003 4 years ago +4

      @@codebasics I used pandas dummy variable instead of using onehotencoding, because it is too confusing.

    • @clashcosmos4641
      @clashcosmos4641 4 years ago +2

      Got the same answer using OneHotEncoder after correcting tons of errors and watching videos over and over.

  • @ZehraKhuwaja65
    @ZehraKhuwaja65 1 year ago

    I must say this is the best course I've come across so far.

  • @snom3ad
    @snom3ad 5 years ago +5

    This was really well done! Kudos to you! It's hard to find clear and concise free tutorials nowadays. Subscribed and hope to see more awesome stuff!

  • @wangangcwayi9420
    @wangangcwayi9420 4 years ago +4

    You have a gift for explaining things even to the layman. Big up to you.

    • @codebasics
      @codebasics 4 years ago

      Thanks a ton Wangs for your kind words of appreciation.

  • @abhinavb717
    @abhinavb717 1 year ago

    I am getting 84% accuracy without encoding the variable, but after encoding I am getting 94% accuracy on the model. Thank you for your teaching. You are doing a great job.

  • @omharne1386
    @omharne1386 2 years ago

    I will say this is one of the best tutorials I have seen in ML.

  • @phil97n
    @phil97n 4 months ago

    I'm reading a textbook that has an exercise to study this same dataset to predict survived. I just finished the exercise from the book - I can't seem to go past 81% score.
    Thanks for your awesome explanation

  • @tanmaykapure81
    @tanmaykapure81 3 years ago +1

    This is the best machine learning playlist I have come across on YouTube 😃👍, hats off to you, sir.

  • @datasciencewithshreyas1806
    @datasciencewithshreyas1806 4 years ago +1

    One of the best explanation for Encoding 👌👍

    • @codebasics
      @codebasics 4 years ago

      Glad it was helpful!

  • @stimsona6859
    @stimsona6859 5 years ago

    To understand the difference between LabelEncoder and OneHotEncoder, see: medium.com/@contactsunny/label-encoder-vs-one-hot-encoder-in-machine-learning-3fc273365621

  • @himanshusingh-vt9do
    @himanshusingh-vt9do 9 months ago

    My model scored 94% accuracy. Thank you, sir, for the amazing video.

  • @deekshithkumar3234
    @deekshithkumar3234 4 years ago +1

    superb and precisely explained

  • @vishwa4908
    @vishwa4908 5 years ago +2

    Awesome, you're explaining concepts in very simple manner.

    • @codebasics
      @codebasics 5 years ago +2

      Vishwa I am happy to help 👍

  • @gokkulkumarvd9125
    @gokkulkumarvd9125 4 years ago +4

    How can I like this video more than 100 times!

    • @codebasics
      @codebasics 4 years ago

      I am happy this was helpful to you.

  • @geekyprogrammer4831
    @geekyprogrammer4831 3 years ago

    This is really the best series to get started with ML

    • @shinosukenohara.123
      @shinosukenohara.123 3 years ago

      How are u starting?

    • @codebasics
      @codebasics 3 years ago +1

      Glad it was helpful!

    • @geekyprogrammer4831
      @geekyprogrammer4831 3 years ago

      @@shinosukenohara.123 I am watching this channel, Krish Naik and Andrew NG course on Coursera

  • @maruthiprasad8184
    @maruthiprasad8184 3 years ago

    For the Mercedes Benz I got 51981.26, for the BMW I got 39728.19, and the score is 94.17%. Thank you very much for making ML easy.

  • @ZOSELY
    @ZOSELY 1 year ago

    I wish I could give this videos 2 thumbs up! Great explanation of all the steps in one-hot encoding! Thank you!!

  • @weshallneversurrender
    @weshallneversurrender 2 years ago

    The Data Science GOAT! One day I will send you a nice donation for all that you have contributed to my journey sir!

  • @nationhlohlomi9333
    @nationhlohlomi9333 1 year ago

    A PLACE TO RUN TO WHEN ONE IS STUCK, THANK YOU SO MUCH SIR

  • @late_nights
    @late_nights 4 years ago +10

    If anyone got stuck at OneHotEncoder at 16:26, run this command, which pins scikit-learn to a version that still accepts categorical_features: pip install -U scikit-learn==0.20

  • @shekharbabar2496
    @shekharbabar2496 4 years ago

    the best video series on ML sir ....Thank you very much sir....

  • @NoureddineBahi
    @NoureddineBahi 3 years ago

    Thank you very much... wonderful work... special thanks from Morocco in North Africa.

  • @timse699
    @timse699 3 years ago +1

    You teach with passion! thank you for the series!

  • @bharathdwarakanath1587
    @bharathdwarakanath1587 4 years ago

    The label encoding done for the independent variable column, 'town' in the second half of the video, I think, isn't needed. Instead just doing One Hot Encoding is enough. Wonderful contribution anyway. Thanks!!

  • @mallikasrivastava
    @mallikasrivastava 3 years ago +1

    Your videos are awesome

    • @codebasics
      @codebasics 3 years ago +2

      Glad you like them!

  • @piyushjha8888
    @piyushjha8888 4 years ago

    model.predict([[45000,4,0,0]]) = array([[36991.31721061]])
    model.predict([[86000,7,0,1]]) = array([[11080.74313219]])
    model.score(X,Y) = 0.9417050937281082
    Thanks, sir, for these exercises.

  • @ayushmanjena5362
    @ayushmanjena5362 2 years ago +1

    15:50 write this code
    from sklearn.preprocessing import OneHotEncoder
    from sklearn.compose import ColumnTransformer
    ct = ColumnTransformer([('town', OneHotEncoder(), [0])], remainder = 'passthrough')
    x = ct.fit_transform(x)
    x

  • @jayshreedonga2833
    @jayshreedonga2833 2 years ago

    thanks sir nice lecture
    sir you are really a great teacher
    you teach everything so nicely
    even tough thing becomes easy when you teach
    thanks a lot

  • @hamzazidan6093
    @hamzazidan6093 6 months ago

    I am here in 2024, 6 years later, and I want to say that this playlist is wonderful!
    I hope that you update it because there are many changes in the syntax of sklearn now.

    • @codebasics
      @codebasics 6 months ago +1

      Hey next week I am launching an ML course on codebasics.io which will address this issue. It has the latest API, in depth math and end to end projects.

  • @istihademon1427
    @istihademon1427 1 month ago

    Highly Qualitative.

  • @komalsunandenishrivastava9211
    @komalsunandenishrivastava9211 4 months ago

    That image on one hot encoding 🤣🔥

  • @ramanandr7562
    @ramanandr7562 1 year ago

    Thank you sir🎉. You made my ML Journey Better.. 🤩

  • @rooshanghous6912
    @rooshanghous6912 1 year ago

    This is an amazing tutorial! saved me so much time and brought so much clarity!!! Thank you!

  • @farjadmir8842
    @farjadmir8842 4 years ago

    I also got them correct. Sir, this course is amazing. You have made it so easy to understand.

    • @codebasics
      @codebasics 4 years ago +1

      Glad to hear that

  • @manasaraju8552
    @manasaraju8552 2 years ago

    difficult topics are easily understood, Thank you so much for the content sir

  • @srinivasreddy1709
    @srinivasreddy1709 4 years ago +2

    Hi Dhaval, your explanation on all the topics is crystal clear.
    Can you please make videos on NLP also

  • @mapa5000
    @mapa5000 1 year ago

    You make it easy with your explanation !! Thank you !!

  • @debaratighatak2211
    @debaratighatak2211 3 years ago +1

    I learned a lot from the exercise that you gave at the end of the video, thank you so much sir!

  • @asamadawais
    @asamadawais 3 years ago

    Simply excellent explanation with very simple examples!

  • @brijesh0808
    @brijesh0808 4 years ago +2

    @13:20 don't we need to do:
    dfle = df.copy() ?
    Otherwise changes in dfle will be reflected back in df.
    Thanks :)
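The aliasing concern raised above can be checked with a short sketch on a hypothetical two-row frame (not the tutorial's data): plain assignment only creates a second name for the same DataFrame, while `.copy()` creates an independent one.

```python
import pandas as pd

# Hypothetical two-row frame standing in for the tutorial's df.
df = pd.DataFrame({"town": ["monroe", "robinsville"], "area": [2600, 2900]})

alias = df               # second name for the SAME object
independent = df.copy()  # separate object with its own data

# Overwriting a column through the alias also changes df ...
alias["town"] = [0, 1]

print(df["town"].tolist())           # the original frame was modified
print(independent["town"].tolist())  # the copy still holds the town names
```

So label-encoding a column through `dfle = df` would overwrite the original town names in `df` as well, which is why `df.copy()` is the safer choice when the untouched frame is still needed.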

  • @sarafatima2252
    @sarafatima2252 4 years ago

    definitely one of the best videos to learn from!

  • @thanusan
    @thanusan 6 years ago +3

    Excellent video - thank you!

  • @leooel4650
    @leooel4650 6 years ago +1

    Mercedes = array([[36991.31721061]])
    BMW = array([[11450.86522658]])
    Accuracy = 0.9417050937281082
    Thanks for your time and knowledge once again!

  • @elinem5311
    @elinem5311 4 years ago +1

    thank you, this helped me so much with multivariate regression with many categorical features!

  • @regithabaiju
    @regithabaiju 4 years ago

    Your tutorial video is helping so much for knowing more about ML.

    • @codebasics
      @codebasics 4 years ago

      I am happy this was helpful to you.

  • @AruLcomments
    @AruLcomments 5 years ago

    You are doing a wonderful job, people like you inspire me to learn and share the knowledge i gain. It is very useful for me. All the best.

  • @armagaan007
    @armagaan007 6 years ago +11

    Wait wait... I don't see the point 😕
    The first half of the video does the same thing as one hot encoding (the second half of the video), but the second half is more tedious and takes more steps.
    Then why not use pd.get_dummies instead of OneHotEncoder?
    What's the advantage of using OneHotEncoder?

    • @codebasics
      @codebasics 6 years ago +9

      I personally like pd.get_dummies as it is convenient to use. I just wanted to show two different ways of doing the same thing, and there are some subtle differences between the two. Check this: stackoverflow.com/questions/36631163/pandas-get-dummies-vs-sklearns-onehotencoder-what-is-more-efficient

    • @armagaan007
      @armagaan007 6 years ago +1

      @@codebasics thank you :]... btw you make grt videos
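One of the subtle differences mentioned in this thread can be sketched as follows, under the assumption of a recent scikit-learn and with made-up town names: a fitted `OneHotEncoder` remembers the categories it saw during training and produces the same columns for new data, whereas `pd.get_dummies` builds columns from whatever values happen to be in the frame it is given.

```python
import pandas as pd
from sklearn.preprocessing import OneHotEncoder

# Hypothetical training and new data; "princeton" never appears in training.
train = pd.DataFrame({"town": ["monroe", "robinsville", "west windsor"]})
new = pd.DataFrame({"town": ["robinsville", "princeton"]})

# The encoder learns its columns once, from the training data.
enc = OneHotEncoder(handle_unknown="ignore")
enc.fit(train[["town"]])

# New data gets the SAME three columns; the unseen town becomes all zeros.
new_encoded = enc.transform(new[["town"]]).toarray()
print(new_encoded)

# get_dummies only sees the new frame, so its columns differ from training.
new_dummies = pd.get_dummies(new["town"], dtype=int)
print(list(new_dummies.columns))
```

For a one-off notebook analysis `pd.get_dummies` is usually the more convenient option; the fitted encoder tends to matter once a trained model has to score data that arrives later.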

  • @felixgallo5132
    @felixgallo5132 3 years ago

    They're basically the same; however, pd.get_dummies is easier to use.
    Thank you, sir.

  • @flamboyantperson5936
    @flamboyantperson5936 6 years ago +6

    Please make a regression video using the preprocessing library with standardization and normalization of variables.

  • @preetipisupati2308
    @preetipisupati2308 4 years ago +2

    Thanks for the excellent video.. but due to the recent enhancements, ColumnTransformer from sklearn.compose is to be used for OneHotEncoding.

    • @codebasics
      @codebasics 4 years ago

      Preeti, can you send me a pull request?

  • @mohammadismailhashime5239
    @mohammadismailhashime5239 2 years ago

    Very nice explanation, appreciated

  • @indrakumari1854
    @indrakumari1854 4 years ago

    Sir, very nicely explained.

    • @codebasics
      @codebasics 4 years ago

      Glad it was helpful!

  • @purnanandabaisnab2856
    @purnanandabaisnab2856 2 years ago

    nice teaching, really outstanding thanks a lot

  • @SrinivasA-vk7if
    @SrinivasA-vk7if 6 months ago

    Excellent video.., thank you so much.

  • @Adnan25048
    @Adnan25048 5 years ago +1

    That's a great tutorial of one-hot encoding. I was unable to find a complete example anywhere. Thanks for sharing.

    • @codebasics
      @codebasics 5 years ago +1

      Thanks Adnan for your valuable feedback

  • @prasadjoshi8213
    @prasadjoshi8213 4 years ago +3

    Hi sir! You teach ML in the easiest way. Thanks a lot! I am going through your videos and assignments. I got these answers: Mercedes: 36991.31, BMW: 11080.74, and model score: 0.9417 (94.17%). My question is how to improve the model score. Is there any way to engineer the features?

  • @justchill2199
    @justchill2199 1 year ago +1

    Someone please help! At 15:14 I am getting an error for { y = df.price }
    It shows "AttributeError: 'DataFrame' object has no attribute 'price'"

    • @pranav9339
      @pranav9339 1 year ago +1

      That means there is no column labelled price. Redo it from the start. You might have lost the column by executing a drop command multiple times.
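A quick way to investigate the missing-column error above is to check the frame's columns and to build X without modifying `df`. This is a sketch on hypothetical data; the column names follow the tutorial, the values do not.

```python
import pandas as pd

# Hypothetical frame with the tutorial's column names.
df = pd.DataFrame({
    "town": ["monroe", "robinsville"],
    "area": [2600, 2900],
    "price": [550000, 575000],
})

# drop() returns a new frame by default, so df keeps its 'price' column
# no matter how many times this cell is re-run.
X = df.drop(columns=["price"])
y = df["price"]

# If this prints False, 'price' was removed earlier (for example by
# reassigning df to a dropped version, or by using inplace=True).
has_price = "price" in df.columns
print(has_price, list(X.columns))
```

If the column is gone, reloading the CSV and re-running the notebook from the top restores it.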

  • @scriptfox614
    @scriptfox614 4 years ago

    The import linear regression statement lol. Amazing tutorial. :D

  • @AbdulSamiasm
    @AbdulSamiasm 4 years ago

    Thanks for updating the exercise code for OneHotEncoder.

  • @indrakusuma8532
    @indrakusuma8532 29 days ago +1

    4:08
    Why is mine True/False and not 0/1?

    • @zeroborgayary6639
      @zeroborgayary6639 10 days ago

      Yeah, it shows the same for me. However, you can convert your dummies to int:
      dummies = dummies.astype(int)
      This converts True and False to 1 and 0 respectively.
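As an alternative to converting afterwards, `pd.get_dummies` accepts a `dtype` argument, so the 0/1 output can be requested directly. Since pandas 2.0 the default dummy dtype is `bool`, which is why newer installs show True/False where the video shows 0/1. A minimal sketch on hypothetical values:

```python
import pandas as pd

# Hypothetical town column.
towns = pd.Series(["monroe", "robinsville", "monroe"], name="town")

# dtype=int requests integer dummies regardless of the pandas default.
dummies = pd.get_dummies(towns, dtype=int)
print(dummies)
```

Either form works for `LinearRegression`, because booleans are treated as 0 and 1 when the data is converted to a numeric array.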

  • @pranavakailash8751
    @pranavakailash8751 3 years ago

    This helped me a lot in my assignment, thank you so much code basics

  • @MrArunlama
    @MrArunlama 1 year ago

    I was learning through a paid course, and then I had to come here to understand this concept of dummy variable.

  • @rafibasha4145
    @rafibasha4145 2 years ago

    @14:01, please explain why you applied label encoding to nominal categories; moreover, LabelEncoder is meant to be applied to the target column only.

  • @chamangupta4624
    @chamangupta4624 3 years ago

    Beautiful explanation, very helpful

  • @swaruppanda2842
    @swaruppanda2842 5 years ago +1

    nicely explained👌

  • @leelavathigarigipati3887
    @leelavathigarigipati3887 4 years ago

    Thank you so much for the detailed step by step explanation.

    • @codebasics
      @codebasics 4 years ago

      Glad it was helpful!

  • @honeybansal9165
    @honeybansal9165 2 years ago

    Bro, from 16:43 onwards, why did you drop the first column? And why did you assign the entire thing to X?
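The reason for dropping one dummy column (the "dummy variable trap") can be checked numerically. In this sketch on hypothetical town values, the full set of dummy columns always sums to 1 per row, which duplicates the intercept column of a linear model, so one of them carries no extra information.

```python
import numpy as np
import pandas as pd

# Hypothetical town column with three categories.
towns = pd.Series(
    ["monroe", "robinsville", "west windsor", "monroe", "robinsville", "west windsor"]
)
all_dummies = pd.get_dummies(towns, dtype=float).to_numpy()
intercept = np.ones((len(towns), 1))

# Intercept plus ALL dummy columns: 4 columns in total.
full = np.hstack([intercept, all_dummies])
# Intercept plus the dummies with the first one dropped: 3 columns.
reduced = np.hstack([intercept, all_dummies[:, 1:]])

rank_full = np.linalg.matrix_rank(full)
rank_reduced = np.linalg.matrix_rank(reduced)
print(full.shape[1], rank_full)        # more columns than independent directions
print(reduced.shape[1], rank_reduced)  # every remaining column is independent
```

Some solvers cope with the redundant column anyway, but dropping it (by slicing, `drop_first=True` in `pd.get_dummies`, or `OneHotEncoder(drop='first')`) keeps the coefficients interpretable.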

  • @rachitbhatt40000
    @rachitbhatt40000 3 years ago

    This module makes my code hot!

  • @avikarto
    @avikarto 6 years ago +1

    This use of OneHotEncoder now appears to be deprecated. You may wish to make a note about this in the video.

    • @codebasics
      @codebasics 6 years ago

      Check my pinned comment at the top (the first comment)

    • @avikarto
      @avikarto 6 years ago

      @@codebasics Sure, on the official docpage here (scikit-learn.org/stable/modules/generated/sklearn.preprocessing.OneHotEncoder.html) the following is written regarding categorical_features: "Deprecated since version 0.20: The categorical_features keyword was deprecated in version 0.20 and will be removed in 0.22. You can use the ColumnTransformer instead."

  • @claude-olivierbatungwanayo9059
    @claude-olivierbatungwanayo9059 6 years ago +1

    Excellent as usual!

  • @Dim-zt5ei
    @Dim-zt5ei 2 years ago +1

    Great videos! Unfortunately it becomes harder and harder to code in the same time as the video because there are more and more changes in the libraries you use. For example sklearn library removed categorical_features parameter for onehotencoder class. It was also the case for other videos from the playlist. Would be great to have the same playlist in 2022 :)

    • @codebasics
      @codebasics 2 years ago +1

      Point noted. I will redo this playlist when I get some free time from tons of priorities that are in my plate at the moment

    • @Dim-zt5ei
      @Dim-zt5ei 2 years ago +1

      @@codebasics Thank you for the reply and again : Great job for all the quality tutorials!

  • @jayasreecarey7843
    @jayasreecarey7843 1 year ago

    Many Thanks ! Great Explanation :)

  • @dineshgaddi1843
    @dineshgaddi1843 3 years ago +4

    First of all, thank you for making life easier for people who want to learn machine learning. You explain really well. Big fan. When I tried to execute categorical_features=[0], it gave an error. It seems this parameter has been deprecated in the latest version of scikit-learn; they recommend using ColumnTransformer instead. I was able to get the same accuracy, 0.9417050937281082. Another thing I wanted to know: when you had initially used the label encoder to convert categorical values to numbers, why did we specify the first column as categorical when it already held integer values?

  • @ttowelie
    @ttowelie 4 years ago +1

    in order not to remove the column by hand, you can use drop_first=True while using get_dummies.

  • @cahitskttaramal3152
    @cahitskttaramal3152 5 years ago +5

    Thank you for the very well explained tutorial. I have one question though: you are training on all of your data here, and yet the model score is only 0.95. Why is that? Shouldn't it be 1? If you were to split your data and then train, it would make sense, but in your case it doesn't. What am I missing here?

    • @codebasics
      @codebasics 5 years ago +7

      Alper, it is not true that if you use all your training data the score is always one. Ultimately, for a regression problem like this you are trying to find a best-fit line (scikit-learn's LinearRegression does this with ordinary least squares rather than gradient descent). A straight line is still an *approximation* of noisy data, hence it will rarely be perfect. I am not saying you can never get a score of 1, but a score less than 1 is normal and accepted.
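The point about training scores below 1 can be reproduced with a small sketch on synthetic data (the slope, intercept, and noise level are arbitrary assumptions). The model is scored on exactly the data it was fitted on, yet the R² stays below 1 because a straight line cannot absorb the random noise.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic data: a true linear trend plus random noise (all values made up).
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(50, 1))
noise = rng.normal(0, 2.0, size=50)
y = 3.0 * X[:, 0] + 5.0 + noise

# Fit and score on the SAME data, as in the tutorial.
model = LinearRegression().fit(X, y)
train_score = model.score(X, y)
print(train_score)
```

A training score of exactly 1 would require every point to lie on the fitted line, which real price data almost never does.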

  • @isaackobbyanni4583
    @isaackobbyanni4583 3 years ago

    Thank you for this series. Such great help

    • @codebasics
      @codebasics 3 years ago

      Glad it was helpful!

  • @infinity2creation551
    @infinity2creation551 1 year ago

    You won my heart; this is exactly what I was looking for.

  • @nelizaat
    @nelizaat 4 years ago

    If anyone is interested, we can also skip the label encoder altogether when using a column transformer, as below:
    from sklearn.compose import make_column_transformer
    from sklearn.preprocessing import OneHotEncoder
    x = df[['town', 'area']].values
    y = df['price'].values
    ct = make_column_transformer(
        (OneHotEncoder(categories='auto'), [0]),  # one-hot encode the town column
        remainder="passthrough"
    )
    X = ct.fit_transform(x)
    X = X[:, 1:]  # drop the first dummy column
    model.fit(X, y)

    • @codebasics
      @codebasics 4 years ago +1

      Thanks neenu for the tip. The notebook in video description is actually updated to make use of column transformer.

    • @nelizaat
      @nelizaat 4 years ago

      @@codebasics I am sorry I did not check that. Thank u sir for your videos, words are not enough to convey my gratitude for sharing your expertise to all.

  • @kmishy
    @kmishy 2 years ago

    12:45 What is the need for the label encoder? Why can't we use the one-hot encoder directly?