Normalization Vs. Standardization (Feature Scaling in Machine Learning)

  • Published on Jun 30, 2024
  • In this video, we will cover the difference between normalization and standardization.
    Feature scaling is an important step to take prior to training machine learning models, to ensure that features are on the same scale.
    Normalization rescales feature values to the range 0 to 1.
    Standardization transforms the data to have a mean of zero and a standard deviation of 1.
    Standardization is also known as Z-score normalization, in which the transformed values behave like a standard normal distribution.
    Check top-rated Udemy courses below:
    10 days of No Code AI Bootcamp
    www.udemy.com/course/10-code-...
    Modern Artificial Intelligence with Zero Coding
    www.udemy.com/course/modern-a...
    Python & Machine Learning for Financial Analysis
    www.udemy.com/course/ml-and-p...
    Modern Artificial Intelligence Masterclass: Build 6 Projects
    www.udemy.com/course/modern-a...
    AWS SageMaker Practical for Beginners | Build 6 Projects
    www.udemy.com/course/practica...
    Data Science for Business | 6 Real-world Case Studies
    www.udemy.com/course/data-sci...
    AWS Machine Learning Certification Exam | Complete Guide
    www.udemy.com/course/amazon-w...
    TensorFlow 2.0 Practical
    www.udemy.com/course/tensorfl...
    TensorFlow 2.0 Practical Advanced
    www.udemy.com/course/tensorfl...
    Machine Learning Regression Masterclass in Python
    www.udemy.com/course/machine-...
    Machine Learning Practical Workout | 8 Real-World Projects
    www.udemy.com/course/deep-lea...
    Machine Learning Classification Bootcamp in Python
    www.udemy.com/course/machine-...
    MATLAB/SIMULINK Bible | Go From Zero to Hero!
    www.udemy.com/course/matlabsi...
    Python 3 Programming: Beginner to Pro Masterclass
    www.udemy.com/course/python-3...
    Autonomous Cars: Deep Learning and Computer Vision in Python
    www.udemy.com/course/autonomo...
    Control Systems Made Simple | Beginner's Guide
    www.udemy.com/course/control-...
    Artificial Intelligence in Arabic | الذكاء الصناعي مبتدئ لمحترف
    www.udemy.com/course/artifici...
    The Complete MATLAB Computer Programming Bootcamp
    www.udemy.com/course/the-comp...
    Thanks and see you in future videos!
    #featurescaling #normalization
  • Science & Technology
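The two transforms described above can be sketched with scikit-learn's MinMaxScaler and StandardScaler (a minimal illustration with made-up numbers, not the notebook from the video):

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler, StandardScaler

# A single toy feature on an arbitrary scale (made-up values)
X = np.array([[10.0], [20.0], [30.0], [40.0], [50.0]])

# Normalization: rescale values into the [0, 1] range
X_norm = MinMaxScaler().fit_transform(X)

# Standardization (Z-score): zero mean, unit standard deviation
X_std = StandardScaler().fit_transform(X)

print(X_norm.ravel())             # all values between 0 and 1
print(X_std.mean(), X_std.std())  # approximately 0 and exactly 1
```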

Comments • 109

  • @samuelkoramoah3552
    @samuelkoramoah3552 1 year ago +19

    This is by far the best explanation I've come across. So simple to understand. Thank you, Prof. You just earned a follower!!

  • @bartekdurczak4085
    @bartekdurczak4085 1 day ago

    King!!! Very good explanation. I watched multiple videos on YT and asked ChatGPT many questions, but now, after your video, I finally understand it.

  • @alexismachado2262
    @alexismachado2262 2 years ago +49

    Great explanation, however I think saying scaling is not required for distance-based algorithms is wrong, as these algorithms are the most affected by the range of features. Can you comment on this?

    • @rafaelposadas2341
      @rafaelposadas2341 7 months ago +2

      I think the same.

    • @shahzarhusain3662
      @shahzarhusain3662 3 months ago +1

      Exactly! Scaling is crucial for distance-based algorithms.

    • @bernardesp_
      @bernardesp_ 1 month ago

      I believe that, as in the case of k-means, the algorithm calculates distances column versus same column, as opposed to a neural network, where each column can have an impact on the target output.
      As distances are measured on the same scale (column x column), of course one feature is going to affect clusterization more (for instance), but that's the point of k-means: we want to see which features describe the data distribution across dimensions.

  • @jingyiwang5113
    @jingyiwang5113 11 months ago +2

    I am really grateful for your detailed explanation! I am self-studying machine learning this summer holiday, and I am at this point now. I was so confused before watching your video. Now I finally understand this point. Thank you so much!

  • @1littlehelper
    @1littlehelper 7 months ago +2

    Hi Professor, thank you so much for this video! Clear and concise; you have no idea how much I needed this. Keep up the great work, I will be sure to check out your other videos as well 😊

  • @vskraiml2032
    @vskraiml2032 2 years ago +3

    Impressed with your way of teaching. You are explaining very well with the right examples... awesome work...
    One small request: in your playlists, the sequence of 'Artificial Intelligence, Machine Learning, and Deep Learning' is jumbled. Please keep the playlist in order for easy learning.

  • @ifeanyiedward2789
    @ifeanyiedward2789 1 year ago +2

    Thank you so much, Professor Ryan. You just made my life easy. Best explanation, so simple to understand even for someone who doesn't have background knowledge in machine learning.

  • @bogdancristurean73
    @bogdancristurean73 1 year ago +2

    This was pretty clearly explained.
    For anyone else looking for this, the standardization chapter begins at 6:49.

  • @twanwolthaus
    @twanwolthaus 4 months ago +3

    Your explanation is as amazing as a rainbow cloud after a thunderstorm!!! I'm so glad I found this visual explanation!

  • @memories-f3n
    @memories-f3n 1 year ago +1

    Well explained about standardization and normalization. Now I have full clarity on these topics. Thanks for taking the effort and explaining it this way.

  • @57_faizalabdillah99
    @57_faizalabdillah99 1 year ago +1

    Amazing explanation... Just in one run, I got your whole point in an easy way. Big thanks!

  • @beloaded3736
    @beloaded3736 1 month ago +1

    This professor is so pleasant for all senses. Thanks for sharing knowledge selflessly :)

  • @yasmineelezaby5197
    @yasmineelezaby5197 9 months ago +1

    Thank you so much! I couldn't wait for the video to end before thanking you! You made it super clear.

  • @atharvambokar573
    @atharvambokar573 1 year ago +1

    This was such a crystal clear explanation! Thank you so much, sir!

  • @Sickkkkiddddd
    @Sickkkkiddddd 2 years ago +1

    Came here from your Udemy course. You are a life saver, Prof!

  • @albertoavendano7196
    @albertoavendano7196 1 year ago +1

    Many thanks for this video... one of the best explanations I've ever seen.

  • @PJ-od9ev
    @PJ-od9ev 1 year ago +1

    A great scientist and teacher. Keep it up, sir. Thank you.

  • @yosefasefaw4207
    @yosefasefaw4207 1 year ago +1

    Amazing video! Clearly explained! Congratulations, Professor!

  • @sukhwinder101
    @sukhwinder101 4 months ago +1

    For ML context: if the data follows a Gaussian distribution (bell shape), use standardization; otherwise go with normalisation (it improves cluster scaling as well).

  • @nutanaigal9761
    @nutanaigal9761 1 year ago +1

    Thanks a lot... worth watching. You explained each concept in a simple way.

  • @amrittiwary080689
    @amrittiwary080689 2 years ago +7

    Great video. I would say we do need scaling for distance-based algorithms, as they will get wrong results if features are on different scales. We don't need scaling for tree-based algorithms, as they are not susceptible to variance.

  • @catulopsae
    @catulopsae 10 months ago +1

    Awesome. I finally understand. Very good explanation, easy to follow.

  • @lethalgaming7087
    @lethalgaming7087 1 month ago +2

    Thank you, Leonard Hofstadter.. 🙂

  • @anuradhabalasubramanian9845
    @anuradhabalasubramanian9845 2 years ago +1

    Fantastic explanation, sir! Thanks so much!

  • @vijayarana2087
    @vijayarana2087 1 year ago +1

    Many thanks for this video... one of the best explanations.

  • @louisCS502
    @louisCS502 26 days ago

    The outlier thing is so crucial actually, damn. I haven't seen this in a machine learning course before. Banger.

  • @AndromedHH
    @AndromedHH 1 year ago +1

    Fantastic explanation! Thank you so much.

  • @mahamadounouridinemamoudou9875
    @mahamadounouridinemamoudou9875 1 year ago +1

    Thank you very much. I can't pass without thanking you and subscribing, for the clarity you gave me on that topic.

  • @fiqrifirdaus
    @fiqrifirdaus 11 days ago +1

    Clear as crystal, thank you.

  • @AbrahamStrange-tt4fv
    @AbrahamStrange-tt4fv 1 year ago +1

    Great explanation. Thank you very much, Sir!

  • @muhammadabdurrazaq2069
    @muhammadabdurrazaq2069 9 months ago +1

    Thank you for the great explanation, so easy to understand.

  • @tasnimsart3430
    @tasnimsart3430 1 year ago +1

    Such a great explanation. Thank you very much.

  • @sanjeevjangra84
    @sanjeevjangra84 3 months ago +1

    Awesome explanation. Thank you!

  • @shadyshawky6737
    @shadyshawky6737 2 years ago +1

    Very clear explanation.
    Thank you :)

  • @user-yk3ec4fl5v
    @user-yk3ec4fl5v 1 year ago +1

    This is my first time watching your video. You look very, very similar to Saif Ali Khan; in fact, the smile is the same. One like vote from me. A gentle smile on the face makes you different from all the others.

  • @leixiao169
    @leixiao169 6 months ago

    Thank you for the clear explanation!

  • @KarinaRodriguez-tb6ol
    @KarinaRodriguez-tb6ol 2 years ago +1

    Amazing explanation!

  • @algosavage7057
    @algosavage7057 2 years ago +1

    Good. Clearly explained. Thanks.

  • @zanyatta1
    @zanyatta1 4 months ago

    The best simple explanation ever.

  • @louisCS502
    @louisCS502 26 days ago

    Thank you, boss man. Just used normalization instead of standardization. Life saver.

  • @EvaPev
    @EvaPev 7 months ago +1

    Outstanding content.

  • @jyothsnaraajjj
    @jyothsnaraajjj 1 year ago +1

    Excellent explanation.

  • @caliguy1260
    @caliguy1260 4 months ago

    Awesome explanation for a beginner like me. Wish I had access to the S&P 500 dataset.

  • @ItsTheGameDude
    @ItsTheGameDude 5 months ago +1

    Thank you so much, Prof!

  • @deepakkumar-ej1je
    @deepakkumar-ej1je 5 months ago +1

    Hello Professor, the video explained the concepts and their practical implementation in a concise manner. Awesome work!

  • @MariaDonayreJackson
    @MariaDonayreJackson 2 months ago +1

    Excellent, thanks!!!

  • @user-zb5zi3ll3g
    @user-zb5zi3ll3g 4 months ago +1

    Informative!

  • @jimherebarbershop8188
    @jimherebarbershop8188 2 years ago +1

    Gr8 explanation!!!

  • @4abdoulaye
    @4abdoulaye 2 years ago +1

    Appreciated it, thanks.

  • @saremish
    @saremish 11 months ago +1

    Excellent!

  • @remmaria
    @remmaria 2 years ago

    Great explanation!! Could you say more about when the input is an image dataset, like for CNNs?

  • @gaberhassan3972
    @gaberhassan3972 9 months ago +1

    Great job 👏👏❤

  • @sololife9403
    @sololife9403 1 year ago +1

    Thank you, Prof!

  • @noonereally0007
    @noonereally0007 6 months ago

    Hey Professor, that was a very cool and simple video to follow and understand. Could I ask where I could find the notebook you used at the end?

  • @poizn5851
    @poizn5851 2 years ago +1

    Thank you, it is helpful.

  • @FRANKWHITE1996
    @FRANKWHITE1996 1 year ago +1

    Thanks for sharing ❤

  • @muralidhargrao
    @muralidhargrao 1 year ago +1

    Hi Prof. Ryan,
    Thank you for explaining the subject in a simple manner.
    I have a Human Resources situation at hand. We have an employee appraisal system where the rating is on a 6-point scale (ranging from Poor Performer to Outstanding Performer). We have 15 departmental heads who rate their respective team members on this 6-point rating scale.
    However, immense biases creep in during evaluation. Also, some evaluators are tougher/more lenient than others. Consequently, we end up with different ranges/averages.
    As the ratings are linked to incentives, good performers sometimes lose out against their peers in other departments.
    I intend to eliminate this bias/lack of neutrality in the ratings given by the 15 different departments (for 1000 employees). Can you suggest how I should go about this situation, please?
    Regards... Muralidhar

  • @peaceadesina
    @peaceadesina 1 year ago

    Thank you!

  • @odosmatthews664
    @odosmatthews664 1 year ago

    Can you show an example of scaling with train test split? Do you scale the train and test data with the same scaler?
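    For reference, the usual pattern (a sketch with made-up data, not the video's notebook): fit the scaler on the training split only, then reuse that same fitted scaler to transform the test split, so both splits are mapped using the training statistics.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Made-up feature matrix and labels
X = np.arange(20, dtype=float).reshape(-1, 1)
y = np.arange(20)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)  # fit on the training data only
X_test_scaled = scaler.transform(X_test)        # reuse the SAME fitted scaler
```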

  • @sanumioluwafemi7247
    @sanumioluwafemi7247 1 year ago +1

    Thank you for this video.

  • @zahra-pl1sk
    @zahra-pl1sk 2 months ago +1

    SUPEEEEEEEER clear. Thanks!

  • @samws_4
    @samws_4 10 months ago

    Helpful!

  • @joguns8257
    @joguns8257 1 year ago +1

    Superb illustration.

    • @professor-ryanahmed
      @professor-ryanahmed  1 year ago +1

      Thank you so much 😀

    • @joguns8257
      @joguns8257 1 year ago

      @@professor-ryanahmed You're welcome, Prof. Please, the link to the dataset?

  • @dunwally2433
    @dunwally2433 1 year ago +1

    Can you share the dataset you used for this demo, please?

  • @user-le2cc1yt8n
    @user-le2cc1yt8n 6 months ago +1

    Wonderful... outstanding!

  • @mamounarakza5951
    @mamounarakza5951 7 months ago +1

    My dear Prof!

  • @NickMaverick4
    @NickMaverick4 2 months ago +1

    Good theoretical explanation... but I think scaling is used for k-means and kNN.

  • @hasszhao
    @hasszhao 2 years ago +1

    Thx, Prof.

  • @believer8754
    @believer8754 16 days ago

    Top explanation along with code. Can you upload the notebook file with each video you explain? Thanks.

  • @patientmuke7008
    @patientmuke7008 1 year ago

    For supervised algorithms, can we use both as data input?

  • @asyakatanani8181
    @asyakatanani8181 11 months ago +1

    As always: outstanding! Your enthusiasm is inspiring... On the other hand, it is clear why tree-based algorithms do not require feature scaling. However, distance-based algorithms such as K-Nearest Neighbors and K-means require Euclidean distance calculation, which means that feature scaling is necessary with them. Am I wrong?

    • @whynot13
      @whynot13 8 months ago

      I think you should scale features for K-means and K-NN. Think about it intuitively. If you are looking at two points and their x, y (feature) distances, how would you want to define their closeness? Do you want their features to be considered equally when calculating your distance, or is one feature more important than the other? If you want both x and y to be considered on an equal playing field, then you should scale them so that the distance computed reflects their importance.
      Scale each feature by the method that makes more sense for that feature. This is most likely [0, 1] across samples.
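      The intuition in this thread can be checked numerically: when one feature lives on a much larger scale, it dominates the Euclidean distance unless the features are rescaled (toy numbers, not from the video):

```python
import numpy as np

# Two samples with features on very different scales:
# salary (tens of thousands) and age (tens)
a = np.array([50_000.0, 25.0])
b = np.array([51_000.0, 60.0])

# Unscaled: the 1000-unit salary gap swamps the 35-year age gap
unscaled = np.linalg.norm(a - b)

# Min-max scale each feature to [0, 1] using assumed feature ranges
mins = np.array([30_000.0, 20.0])
maxs = np.array([90_000.0, 70.0])
a_s = (a - mins) / (maxs - mins)
b_s = (b - mins) / (maxs - mins)
scaled = np.linalg.norm(a_s - b_s)

print(unscaled)  # ~1000.6: almost entirely the salary difference
print(scaled)    # ~0.70: now dominated by the (relatively large) age gap
```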

  • @zaldi19
    @zaldi19 1 year ago

    Question: what if our model encounters a bigger value than what we had in the training data? How do we handle that?

  • @anp9929
    @anp9929 1 year ago +1

    You've not missed a single base, brother. What an explanation!

  • @arjundev4908
    @arjundev4908 2 years ago +1

    He used to be on Stemplicity as well.

  • @lucasgonzalezsonnenberg3204
    @lucasgonzalezsonnenberg3204 9 months ago

    Firstly, I like your explanation very much.
    Secondly, I would like to know: how do you plot the raw and rescaled data? Do you use the histogram function from pandas?
    Thank you very much, and keep it up!

  • @plowface
    @plowface 3 months ago

    I'm finding a lot of sources saying feature scaling is advised when using k-nearest neighbours. Is there more nuance to this point? Is scaling required after all?

  • @joguns8257
    @joguns8257 1 year ago

    Please, where's the link to the dataset? I'd really appreciate it if you could paste it here, Prof. Thanks a lot.

  • @amineazemour2476
    @amineazemour2476 1 year ago +1

    Top of the top.

  • @liviumircea6905
    @liviumircea6905 1 year ago +1

    Good

  • @user-lk1fd7lz3c
    @user-lk1fd7lz3c 1 year ago

    Thx

  • @mahmodi5timetolearn
    @mahmodi5timetolearn 7 months ago +1

    The best, marhaba!

  • @SaFFire123x
    @SaFFire123x 14 days ago

    Just came from a K-means clustering course that demonstrates how normalization results in better clusters. But at 11:40, you say K-means clustering doesn't require standardization or normalization. I'm confused.

  • @andyh3970
    @andyh3970 3 months ago

    Could you put a link to the CSV file so we can download it and try the exercise ourselves, please?

  • @alhelalyhossam
    @alhelalyhossam 1 year ago

    I really liked your explanation, thanks.
    P.S.
    Are you Egyptian?
    I mean, your accent is perfect, but your pauses while speaking give the intuition that you're from the Great Egypt.

  • @yamanarslanca8325
    @yamanarslanca8325 11 months ago

    11:40 Wait, I am confused now, because I thought that since the distance between data points is so important in algorithms such as kNN, SVM, etc., scaling is a MUST pre-processing step, but now you are saying that it is not required? Could you please clarify this?

  • @ARCsGARDEN
    @ARCsGARDEN 1 year ago

    Can you please share the GitHub repo link for accessing the data files used in the video?

  • @jeffkamuthu3276
    @jeffkamuthu3276 2 months ago +1

    A whole semester in 20 minutes.

  • @TheOraware
    @TheOraware 1 year ago +1

    At 11:27 you mentioned in the last bullets that scaling is not required for K-NN and SVM, which is not correct. K-NN and SVM exploit distances or similarities, so they do require scaling.

  • @ArvindKumar-vr4gf
    @ArvindKumar-vr4gf 1 year ago

    How do you apply z-score normalisation to live data? 🙏🙏🙏

  • @cvino0618
    @cvino0618 10 months ago

    Could've added this to your Udemy course.

  • @yossryasser2646
    @yossryasser2646 10 months ago

    Where can I get the dataset?

  • @jiberuba8856
    @jiberuba8856 2 years ago

    Thank you. Where can I download the notebook code?

    • @ShawnBecker11
      @ShawnBecker11 2 years ago

      I also have this question.

  • @chandrasekharnettem1537
    @chandrasekharnettem1537 1 year ago

    Distance-based methods assume that features are normalized, so feature scaling is required? Please confirm.
    Tree-based methods do not need scaling.

  • @things-tz8dj
    @things-tz8dj 5 months ago

    Dataset, please.

  • @mariwanahmad9362
    @mariwanahmad9362 1 year ago

    Dear Ryan, how do you test a model trained on scaled data?
    I used it this way and the predicted value is very different:
    X_testing=np.array([[550,440,110,0,0,0,0.33,400,8.8,0,863,771]]) #78.6
    ypred=model.predict(scaler.fit_transform( X_testing)) # predicted should be 78, but I got [[0.17291696]]
    Also without fit_transform the value is different.
    Many thanks for your reply.
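
    The mismatch above is most likely the `fit_transform` call: at prediction time, the scaler fitted on the training data should be reused with `transform` only, since `fit_transform` re-fits the scaler on the single test row and destroys the original scale. A minimal sketch with made-up data (the commenter's features and model are not reproduced here):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import MinMaxScaler

# Made-up training data: y is a simple linear function of X
X_train = np.array([[100.0], [200.0], [300.0], [400.0]])
y_train = np.array([1.0, 2.0, 3.0, 4.0])

scaler = MinMaxScaler()
X_train_scaled = scaler.fit_transform(X_train)  # fit ONCE, on training data
model = LinearRegression().fit(X_train_scaled, y_train)

X_new = np.array([[250.0]])

# Correct: transform the new row with the already-fitted scaler
y_good = model.predict(scaler.transform(X_new))   # ~2.5, as expected

# Wrong: fit_transform on a single row maps it to 0, wrecking the prediction
y_bad = model.predict(MinMaxScaler().fit_transform(X_new))

print(y_good, y_bad)
```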