Machine Learning Mastery
What is KFold Cross Validation? When NOT to use it? How to use it with modifications for your data
KFold cross-validation plays a very important role in understanding the variance behaviour of your model. Most people take it for granted and don't use its full potential. I explain how to use it right, how to read the variance across folds, and when NOT to use vanilla KFold but rather its extensions as implemented in scikit-learn.
My AI and Generative AI courses are detailed here:
ai.generativeminds.co
To get a FREE invite to our classes, fill in the link below:
invite.generativeminds.co
Views: 365
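For reference, a minimal sketch (not from the video) of reading the fold-to-fold variance with scikit-learn, plus one of the extensions (StratifiedKFold) for when vanilla KFold misleads; the synthetic dataset and the model are placeholders:
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import KFold, StratifiedKFold, cross_val_score

X, y = make_classification(n_samples=500, random_state=0)     # placeholder data
model = RandomForestClassifier(random_state=0)                 # placeholder model

scores = cross_val_score(model, X, y, cv=KFold(n_splits=5, shuffle=True, random_state=0))
print("fold scores:", scores)
print("mean=%.3f  std=%.3f" % (scores.mean(), scores.std()))   # the std is the variance play across folds

# Stratified extension: keeps the class ratio identical in every fold (useful for imbalanced labels)
strat = cross_val_score(model, X, y, cv=StratifiedKFold(n_splits=5, shuffle=True, random_state=0))
print("stratified mean=%.3f  std=%.3f" % (strat.mean(), strat.std()))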

Videos

How to really find if my Test Data is diverging from my Training dataset? This WORKS!
Views: 371 · 6 months ago
Adversarial Validation is a practical method for finding out whether the test set (as seen in production) has started to diverge from the training set. We detail the scoring function and how you can implement it. Very effective for mixed tabular data use cases. My AI and Generative AI courses are detailed here: ai.generativeminds.co To get a FREE invite to our classes, fill in the link below: invite.generativeminds.co
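For reference, a minimal sketch (not from the video) of the adversarial-validation idea: label training rows 0 and production/test rows 1 and see how well a classifier can tell them apart; the model choice is an illustrative assumption, and X_train / X_test stand in for your own feature matrices:
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

def adversarial_auc(X_train, X_test):
    X = np.vstack([X_train, X_test])
    y = np.concatenate([np.zeros(len(X_train)), np.ones(len(X_test))])
    clf = GradientBoostingClassifier(random_state=0)
    # cross-validated ROC AUC of the "is this a test row?" classifier:
    # ~0.5 means the sets look alike, well above 0.5 means the test set has drifted
    return cross_val_score(clf, X, y, cv=5, scoring="roc_auc").mean()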
Use the Central Limit Theorem to turn any distribution Normal? Really?
Views: 193 · 7 months ago
The Central Limit Theorem is usually introduced alongside the law of large numbers. We list exactly what the theorem states and how empirically non-Gaussian distributions can be handled using it in our applications. My AI and Generative AI courses are detailed here: ai.generativeminds.co To get a FREE invite to our classes, fill in the link below: invite.generativeminds.co
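For reference, a small sketch (not from the video) showing the effect empirically: means of repeated samples from a skewed exponential distribution look increasingly Gaussian as the sample size grows; the distribution and sample sizes are illustrative:
import numpy as np

rng = np.random.default_rng(0)
population = rng.exponential(scale=2.0, size=100_000)        # a clearly non-Gaussian source

for n in (2, 30, 500):
    means = np.array([rng.choice(population, size=n).mean() for _ in range(2_000)])
    # the spread of the sample means shrinks roughly like population.std() / sqrt(n)
    print(f"n={n:4d}  mean of means={means.mean():.3f}  std of means={means.std():.3f}")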
How Bootstrapping helps with scoring your Train Test Divergences?
Views: 173 · 7 months ago
How do you score train-test divergence? Bootstrapping is one simple approach to help you get a grip on this topic. Relying on random sampling, it is statistically valid and practically a good reference point to use alongside adversarial scoring techniques. My AI and Generative AI courses are detailed here: ai.generativeminds.co To get a FREE invite to our classes, fill in the link below: invite.generativeminds.co
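For reference, a minimal sketch (not from the video) of one way bootstrapping can be used here: build a bootstrap reference interval for a statistic on the training data and check whether the test-set statistic falls outside it; the column-mean statistic and the names train_col / test_col are illustrative assumptions:
import numpy as np

def bootstrap_interval(train_col, n_boot=2_000, alpha=0.05, seed=0):
    rng = np.random.default_rng(seed)
    stats = np.array([rng.choice(train_col, size=len(train_col), replace=True).mean()
                      for _ in range(n_boot)])
    return np.quantile(stats, [alpha / 2, 1 - alpha / 2])

# usage sketch:
# lo, hi = bootstrap_interval(train_col)
# diverged = not (lo <= test_col.mean() <= hi)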
How I built Generative AI for Retail in 60 Days
Views: 515 · 1 year ago
Below is the link to a FREE interactive video where I explain the step-by-step path to building your own Generative AI for your business within 60 days. Just follow the steps and you will get RESULTS!! WATCH it FREE here: how-to-llm.generativeminds.co/ My AI and Generative AI courses are detailed here: ai.generativeminds.co To get a FREE invite to our classes, fill in the link below: invite.generativeminds.co
Bayesian Optimization - Math and Algorithm Explained
Views: 48K · 3 years ago
Learn the algorithm behind Bayesian optimization, surrogate function calculations and the acquisition function (Upper Confidence Bound). Visualize a from-scratch implementation of how the approximation works iteratively. Finally, understand how to use the scikit-optimize package to do hyperparameter tuning with Bayesian optimization. My AI and Generative AI courses are detailed here: ai.generativeminds.co To get a FREE invite to our classes, fill in the link below: invite.generativeminds.co
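For reference, a minimal sketch (not from the video) of hyperparameter tuning with scikit-optimize's gp_minimize; the model, search space, data and number of calls are illustrative assumptions, and skopt's LCB acquisition is the minimization counterpart of UCB:
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_val_score
from skopt import gp_minimize
from skopt.space import Integer, Real

X, y = make_classification(n_samples=300, random_state=0)     # placeholder data

def objective(params):
    max_depth, learning_rate = params
    model = GradientBoostingClassifier(max_depth=max_depth, learning_rate=learning_rate, random_state=0)
    # gp_minimize minimizes, so return the negative cross-validated accuracy
    return -cross_val_score(model, X, y, cv=3).mean()

space = [Integer(1, 8, name="max_depth"),
         Real(1e-3, 0.5, prior="log-uniform", name="learning_rate")]
result = gp_minimize(objective, space, n_calls=30, acq_func="LCB", random_state=0)
print("best params:", result.x, "best CV accuracy:", -result.fun)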
Decision Tree Hyperparam Tuning
Views: 3.7K · 3 years ago
Learn how to use training and validation datasets to find the optimum values for the hyperparameters of your decision tree. Demonstrated for the max tree depth and min samples per leaf hyperparameters. My AI and Generative AI courses are detailed here: ai.generativeminds.co To get a FREE invite to our classes, fill in the link below: invite.generativeminds.co
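For reference, a minimal sketch (not from the video) of the same sweep over max depth and min samples per leaf against a held-out validation set; the synthetic data and the candidate values are illustrative:
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1_000, random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.3, random_state=0)

for max_depth in (2, 4, 6, 8, None):
    for min_samples_leaf in (1, 5, 20):
        tree = DecisionTreeClassifier(max_depth=max_depth, min_samples_leaf=min_samples_leaf,
                                      random_state=0).fit(X_tr, y_tr)
        # compare training vs validation accuracy to spot under- and over-fitting
        print(max_depth, min_samples_leaf,
              round(tree.score(X_tr, y_tr), 3), round(tree.score(X_val, y_val), 3))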
Decision Tree Cost Pruning - Hands On
Views: 2.3K · 3 years ago
In this hands-on video you will learn how to find the right cost-complexity pruning alpha parameter for your decision tree. My AI and Generative AI courses are detailed here: ai.generativeminds.co To get a FREE invite to our classes, fill in the link below: invite.generativeminds.co
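For reference, a minimal sketch (not from the video) of finding the pruning alpha with scikit-learn's cost_complexity_pruning_path and picking the value that maximizes validation accuracy; the dataset is a placeholder:
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1_000, random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.3, random_state=0)

# candidate alphas along the pruning path of the unpruned tree
path = DecisionTreeClassifier(random_state=0).cost_complexity_pruning_path(X_tr, y_tr)
best_alpha = max(path.ccp_alphas,
                 key=lambda a: DecisionTreeClassifier(ccp_alpha=a, random_state=0)
                               .fit(X_tr, y_tr).score(X_val, y_val))
print("best ccp_alpha:", best_alpha)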
Gradient Boosting Hands-On Step by Step from Scratch
Views: 2.7K · 3 years ago
Learn how to write the gradient boosting tree algorithm from scratch. Learn how the loss function is derived and applied in Python code as part of your boosting iterations. Learn a trick to present your charts as interpretable categorical values rather than encoded numerical values (this is done a lot in practice). My AI and Generative AI courses are detailed here: ai.generativeminds.co To get a FREE invite to our classes, fill in the link below: invite.generativeminds.co
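For reference, a bare-bones sketch (not the video's exact code) of gradient boosting for regression with squared loss, where each round fits a shallow tree to the current residuals; tree depth, learning rate and round count are illustrative:
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def gb_fit(X, y, n_rounds=100, lr=0.1, max_depth=2):
    f0 = y.mean()                              # initial prediction: the mean of y
    pred, trees = np.full(len(y), f0), []
    for _ in range(n_rounds):
        residuals = y - pred                   # negative gradient of the squared loss
        tree = DecisionTreeRegressor(max_depth=max_depth).fit(X, residuals)
        pred = pred + lr * tree.predict(X)     # take a small step towards the residuals
        trees.append(tree)
    return f0, trees

def gb_predict(X, f0, trees, lr=0.1):
    return f0 + lr * sum(t.predict(X) for t in trees)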
Hyperparameters - Introduction & Search
Views: 5K · 3 years ago
We cover: 1. What hyperparameters are and how they differ from model parameters. 2. Why hyperparameter tuning is important, with two examples from deep learning. 3. Searching hyperparameters - grid search vs. random search. 4. The mathematical edge random search has over grid search. My AI and Generative AI courses are detailed here: ai.generativeminds.co To get a FREE invite to our classes, fill in the link below: invite.generativeminds.co
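For reference, a minimal sketch (not from the video) contrasting grid search and random search on the same budget with scikit-learn; the model, grids and data are illustrative:
from scipy.stats import randint
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, RandomizedSearchCV

X, y = make_classification(n_samples=500, random_state=0)
model = RandomForestClassifier(random_state=0)

grid = GridSearchCV(model, {"max_depth": [3, 6, 9], "min_samples_leaf": [1, 5, 20]}, cv=3).fit(X, y)
rand = RandomizedSearchCV(model, {"max_depth": randint(2, 12), "min_samples_leaf": randint(1, 30)},
                          n_iter=9, cv=3, random_state=0).fit(X, y)   # same budget of 9 candidates
print("grid:  ", grid.best_params_, round(grid.best_score_, 3))
print("random:", rand.best_params_, round(rand.best_score_, 3))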
Feature Importance Formulation of Decision Trees
Views: 6K · 3 years ago
Learn the feature importance formulation for both a single decision tree and for multiple trees, illustrated with a simple example. My AI and Generative AI courses are detailed here: ai.generativeminds.co To get a FREE invite to our classes, fill in the link below: invite.generativeminds.co
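For reference, a minimal sketch (not from the video) showing the same impurity-based formulation as exposed by scikit-learn, for a single tree and averaged over the trees of a forest; the synthetic data is a placeholder:
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, n_features=6, n_informative=3, random_state=0)

tree = DecisionTreeClassifier(random_state=0).fit(X, y)
forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

print("single tree  :", np.round(tree.feature_importances_, 3))
print("forest       :", np.round(forest.feature_importances_, 3))
# the forest figure is essentially the mean of the per-tree importances
print("mean of trees:", np.round(np.mean([t.feature_importances_ for t in forest.estimators_], axis=0), 3))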
How to Regularize with Dropouts | Deep Learning Hands On
Views: 597 · 3 years ago
1. Dropout - benefits, effect on your architecture, and types of dropout. 2. How to implement dropout with weight constraints? 3. How to implement dropout and still maintain the original capacity of the network? 4. How to find the ideal dropout ratio for your architecture? 5. How important are activation and gradient distributions in deciding the dropout rate for your architecture? All this wit...
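For reference, a minimal Keras sketch (not from the video) of dropout combined with a max-norm weight constraint; the layer sizes, dropout rate and norm value are illustrative assumptions:
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.constraints import MaxNorm

model = keras.Sequential([
    layers.Input(shape=(20,)),
    layers.Dense(128, activation="relu", kernel_constraint=MaxNorm(3.0)),
    layers.Dropout(0.5),   # to keep the original capacity, a common trick is to widen the layer by 1/(1 - rate)
    layers.Dense(128, activation="relu", kernel_constraint=MaxNorm(3.0)),
    layers.Dropout(0.5),
    layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])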
How to Regularize with Weight & Activation Regularizations | Deep Learning
Views: 546 · 3 years ago
1. How to regularize neural networks using weight and activation regularization. 2. How weight & activity regularization are two sides of the same coin. 3. What the signatures of the activation distribution are for your architecture, and how to tell whether you are correctly optimizing your hyperparameters for regularization. 4. Identify the signature of "optimal" activation distributions using f...
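For reference, a minimal Keras sketch (not from the video) of weight (kernel) and activation (activity) regularization on the same layer; the L2 coefficients are illustrative assumptions:
from tensorflow import keras
from tensorflow.keras import layers, regularizers

model = keras.Sequential([
    layers.Input(shape=(20,)),
    layers.Dense(64, activation="relu",
                 kernel_regularizer=regularizers.l2(1e-4),      # penalizes large weights
                 activity_regularizer=regularizers.l2(1e-5)),   # penalizes large activations
    layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy")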
How to Fix Vanishing & Exploding Gradient Problems | Deep Learning
Views: 2.8K · 3 years ago
1. How to identify whether you are facing a vanishing or exploding gradient problem. Take a classification example and understand the signature of convergence with and without gradient problems. 2. Fix vanishing gradients with ReLU & correct weight initialization. 3. What causes exploding gradients? Take a regression example & analyze how it looks when we do a sensitivity analysis on the architecture. ...
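For reference, a minimal Keras sketch (not from the video) of the usual remedies: ReLU with He initialization against vanishing gradients, and gradient-norm clipping against exploding gradients; sizes and the clip value are illustrative:
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    layers.Input(shape=(20,)),
    layers.Dense(64, activation="relu", kernel_initializer="he_normal"),
    layers.Dense(64, activation="relu", kernel_initializer="he_normal"),
    layers.Dense(1),
])
# clipnorm caps the gradient norm at every update, a common fix for exploding gradients
model.compile(optimizer=keras.optimizers.Adam(learning_rate=1e-3, clipnorm=1.0), loss="mse")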
How to Accelerate training with Batch Normalization? | Deep Learning
Views: 733 · 3 years ago
1. How to perform a sensitivity analysis of your neural network architecture when scaling and batch normalization are part of your design. 2. Understand how scaling really benefits the convergence properties of your architecture. 3. Input scaling & output scaling benefits. 4. What does batch normalization really do? 5. How BatchNormalization boosts both the speed of convergence & smoothness of convergence...
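For reference, a minimal Keras sketch (not from the video) of a network with BatchNormalization between Dense layers; scale the inputs beforehand (e.g. with a StandardScaler), and note the architecture details are illustrative:
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    layers.Input(shape=(20,)),
    layers.Dense(64, activation="relu"),
    layers.BatchNormalization(),   # normalizes activations per mini-batch, speeding and smoothing convergence
    layers.Dense(64, activation="relu"),
    layers.BatchNormalization(),
    layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")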
What is a Perceptron Learning Algorithm - Step By Step Clearly Explained using Python
Views: 21K · 3 years ago
What is a Perceptron Learning Algorithm - Step By Step Clearly Explained using Python
How to Tune Learning Rate for your Architecture? | Deep Learning
Views: 1.4K · 3 years ago
How to Tune Learning Rate for your Architecture? | Deep Learning
How to Find the Right number of Layers/Neurons for your Neural Network?
Views: 12K · 3 years ago
How to Find the Right number of Layers/Neurons for your Neural Network?
How to Configure and Tune Batch Size for your Neural Network?
Views: 2.6K · 3 years ago
How to Configure and Tune Batch Size for your Neural Network?
Back Propagation Math Step By Step Detailed with an Example | Deep Learning
Views: 2.4K · 3 years ago
Back Propagation Math Step By Step Detailed with an Example | Deep Learning
Back Propagation Concept Math Step By Step for a Two Layer Feed Forward Network
Views: 444 · 3 years ago
Back Propagation Concept Math Step By Step for a Two Layer Feed Forward Network
How Gradient Descent finds the weights? Gradient Descent Math Step By Step with Example | Neural Net
Views: 12K · 3 years ago
How Gradient Descent finds the weights? Gradient Descent Math Step By Step with Example | Neural Net
How to use Gaussian Mixture Models, EM algorithm for Clustering? | Machine Learning Step By Step
Views: 18K · 4 years ago
How to use Gaussian Mixture Models, EM algorithm for Clustering? | Machine Learning Step By Step
Principal Component Analysis (PCA) Maths Explained with Implementation from Scratch
Views: 628 · 4 years ago
Principal Component Analysis (PCA) Maths Explained with Implementation from Scratch
How to cluster using Hierarchical Clustering Algorithm | Machine Learning Step By Step
Views: 705 · 4 years ago
How to cluster using Hierarchical Clustering Algorithm | Machine Learning Step By Step
DBSCAN Math and Algorithm Explained Step by Step
Views: 1.7K · 4 years ago
DBSCAN Math and Algorithm Explained Step by Step
KMeans Clustering Math, Assumptions & Algorithm Explained - When Not to Use It? How Does It Work?
Views: 1.5K · 4 years ago
KMeans Clustering Math, Assumptions & Algorithm Explained - When Not to Use It? How Does It Work?
XGBOOST Math Explained - Objective function derivation & Tree Growing | Step By Step
Views: 9K · 4 years ago
XGBOOST Math Explained - Objective function derivation & Tree Growing | Step By Step
What is Extreme about XGBoost? Why Does XGBoost Win Kaggle? Algorithmic, Model & System Optimizations
Views: 2.7K · 4 years ago
What is Extreme about XGBoost? Why Does XGBoost Win Kaggle? Algorithmic, Model & System Optimizations
Gradient Boosting - Math Clearly Explained Step By Step | Machine Learning Step By Step
Views: 5K · 4 years ago
Gradient Boosting - Math Clearly Explained Step By Step | Machine Learning Step By Step

Comments

  • @mshika2150
    @mshika2150 11 days ago

    Can I get the code?

  • @khemchand494
    @khemchand494 1 month ago

    Very well explained. I got the complete intuition of GMMs in a go.

  • @vrhstpso
    @vrhstpso 1 month ago

    😀

  • @sm-pz8er
    @sm-pz8er 3 months ago

    Very well simplified explanation. Thank you

  • @prabhjot-ud6ru
    @prabhjot-ud6ru 3 months ago

    best ever explanation for GMM. Thanks a lot for such a helpful video.

  • @benheller472
    @benheller472 4 months ago

    Hello, I’ve been watching your videos. Thank you! They are great. Is there a way to contact you directly?

  • @9951468414
    @9951468414 4 months ago

    Which reference book do you use?

  • @9951468414
    @9951468414 4 months ago

    Hello there, Can you give the material notes

  • @VIVEK_InLoop
    @VIVEK_InLoop 5 months ago

    Nice sir

  • @xiaoyongli
    @xiaoyongli 6 months ago

    well done in <15 min!!! highly recommended

  • @DM-py7pj
    @DM-py7pj 6 months ago

    is the end of the video missing?

    • @machinelearningmastery
      @machinelearningmastery 6 months ago

      Content is not missing. It's just a bit short, that's all.

  • @nashtashasaint-pier7404
    @nashtashasaint-pier7404 7 months ago

    This seems to be correct if and only if you assume that your three models are independent. This is fine, but I think this does not say much in practical cases, as it is very unlikely that you will have 3 base learners that are not correlated. In general, it seems pretty complicated to come up with a "comprehensive" formula that takes into account the respective covariances of these three models with each other and expresses the probabilistic advantage ensembling has.

    • @machinelearningmastery
      @machinelearningmastery 6 months ago

      The formulation is the premise for why variance reduces, theoretically, when ensembling is in place compared to the independent models. From a practical standpoint it works well, which is why random forest is such a star, with so many hyperparameters to ensure you get trees that are as different as possible across the hundreds of features faced in real applications.
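      For illustration (not part of the original reply), a tiny numeric sketch of that premise: the variance of an average of three unit-variance model errors drops towards 1/3 when they are uncorrelated, and much less when they are strongly correlated:
      import numpy as np

      rng = np.random.default_rng(0)
      indep = rng.normal(0.0, 1.0, size=(100_000, 3))              # three uncorrelated model errors
      print(np.var(indep.mean(axis=1)))                            # ~ 1/3

      rho = 0.8                                                    # strongly correlated models
      cov = rho * np.ones((3, 3)) + (1 - rho) * np.eye(3)
      corr = rng.multivariate_normal(np.zeros(3), cov, size=100_000)
      print(np.var(corr.mean(axis=1)))                             # ~ (1 + 2*rho)/3, much closer to 1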

  • @VictorTimely-9
    @VictorTimely-9 7 months ago

    More on Statistics.

  • @wenkuchen
    @wenkuchen 7 months ago

    very clear explanation for decision tree features importance, thanks

  • @meha1233
    @meha1233 7 months ago

    You should mention the normalized method. I kill myself to find out how to normalize those numbers

    • @machinelearningmastery
      @machinelearningmastery 7 months ago

      Which normalization would you like to see? The weight computation in each iteration is normalized. Could you clarify?

  • @countrylifevlog524
    @countrylifevlog524 7 months ago

    can you provide these slides

  • @tomryan7679
    @tomryan7679 8 months ago

    @machinelearningmaster Great video, thanks! Could you please share the dataset used so that we can replicate this?

  • @namanjha4964
    @namanjha4964 8 months ago

    Thanks a lot for the video

  • @saleemun8842
    @saleemun8842 8 months ago

    by far the clearest explanation of bayesian optimization, great work, thanks man!

  • @Xavier-Ma
    @Xavier-Ma 8 months ago

    Wonderful explanation! Thanks professor.

  • @YuekselG
    @YuekselG 9 months ago

    Is there a mistake at 9:10? There is one f(x) too many, I think. It should be N(f(x_1), ..., f(x_n) | 0, C*) / N(f(x_1), ..., f(x_n) | 0, C). Can anyone confirm this? Ty

  • @syedtalhaabidalishah961
    @syedtalhaabidalishah961 9 months ago

    what a video!!! simple and straight forward

  • @Goop3
    @Goop3 9 months ago

    Very intuitive explanation!! Thank you so much! I found this gem of a channel today!

  • @hosseindahaee2886
    @hosseindahaee2886 9 months ago

    Thanks, but there is a typo: for y = -1 it should be w^T x + b <= -1, not w^T x + b <= 1.

  • @gvdkamdar
    @gvdkamdar 10 months ago

    This entire series is one of the most comprehensive explanations I have found for SVMs. Extremely grateful for it

  • @agc444
    @agc444 10 months ago

    Wonderful video, many thanks. Perhaps it would be nice if you made the code available for us learners to play with. Thanks.

  • @saremish
    @saremish 10 months ago

    Very clear and informative. Thanks!

  • @hatemmohamed8387
    @hatemmohamed8387 10 months ago

    is there any repo containing the codes for the entire playlist

  • @mahdiyehbasereh
    @mahdiyehbasereh 11 months ago

    Why don't we inherit from the keras.Model class? Thanks a lot for your tutorials

    • @machinelearningmastery
      @machinelearningmastery 6 months ago

      Yes, you can do that and make it easier to use in multiple places.

  • @ywbc1217
    @ywbc1217 11 months ago

    extremely not good explanations

  • @dhanushka5
    @dhanushka5 11 months ago

    Thanks

  • @Ruhgtfo
    @Ruhgtfo 11 months ago

    Best explanation I've found, thank you

  • @DilipKumar-dc2rx
    @DilipKumar-dc2rx 1 year ago

    You taught better than my instructor 🙂

  • @farhaddotita8855
    @farhaddotita8855 1 year ago

    Thanks so much, the best explanation of XGBoost I've seen so far; most people don't care about the math intuition!

  • @JLBorloo
    @JLBorloo 1 year ago

    Good stuff but consider sharing the Notebooks in the future

  • @chinmayb172
    @chinmayb172 1 year ago

    Can you please tell me if I have 10 classes of training data, what number of epochs should I use?

    • @machinelearningmastery
      @machinelearningmastery 1 year ago

      In general, I recommend setting epochs to a very large value, say 50,000. Then, in your code, set up early-exit logic as part of training. This works best for most cases, since the training fit will automatically exit once convergence has happened. Hope that helps.
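      A minimal Keras sketch of that recipe (not from the original reply), assuming a compiled model and training arrays already exist; the monitored metric and patience are illustrative:
      from tensorflow import keras

      early_stop = keras.callbacks.EarlyStopping(monitor="val_loss", patience=20, restore_best_weights=True)
      history = model.fit(X_train, y_train,
                          validation_split=0.2,
                          epochs=50_000,              # effectively "very large"
                          callbacks=[early_stop])     # training exits once the validation loss stops improving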

  • @fardian6818
    @fardian6818 1 year ago

    I am a silent internet user, what I usually do when I like a content is just by pressing the like button and save the link on the txt file, but this time is an exception, your content is very simple and completely what I'm looking for. I write you a comment, as the first commentator in this video 😀 You have a new subscriber now. Keep up the good work

  • @isultan
    @isultan 1 year ago

    Wow!!! Excellent lecture!!

  • @mikehawk4583
    @mikehawk4583 1 year ago

    Why do you add the mean of the predicted points back to the predicted points?

    • @machinelearningmastery
      @machinelearningmastery 1 year ago

      Let's see if we can correlate it with how humans learn. Say we are in a forest, searching for trails of human footprints to get out of it. Every time we find a footprint, we validate & learn about the surroundings, vegetation, terrain, etc. Over a period of time we learn what leads to the exit and what doesn't. That is precisely the idea here. Hope that helps.

    • @mikehawk4583
      @mikehawk4583 1 year ago

      @@machinelearningmastery I'm sorry, but I still don't get it. Could you explain it with more math? What I don't get is: after predicting a mu, why do we need to add omega? What does omega do here?

  • @abhishekchaudhary6975
    @abhishekchaudhary6975 1 year ago

    Nice video!!

  • @mohammedakl2077
    @mohammedakl2077 1 year ago

    thank you

  • @vipuldogra6600
    @vipuldogra6600 1 year ago

    The best there is.

  • @yurigansmith
    @yurigansmith 1 year ago

    In this example the new weights for the formerly misclassified examples are increased, while the weights for the correctly classified are decreased (which seems reasonable to me at the moment). But if e_t becomes greater than 0.5, lambda_t becomes negative and the direction of the weight adaptation is swapped, which would lead to undersampling of the misclassified and oversampling of the correctly classified examples in the next round. Is lambda "allowed" to become negative in the first place? Somewhere (slides on boosting algorithms) I read that lambda is supposed to be non-negative, but I'm not sure if I understood the statement resp. context of the statement correctly.

    • @machinelearningmastery
      @machinelearningmastery 1 year ago

      Great question.
      A. First, there are two weights in the system - one driving the weight of each data point and another driving the weight of the classifier. I explain both below.
      B. Second, an error > 0.5 does give a negative lambda, and this is part of the design: when error < 0.5, lambda > 0; when error = 0.5, lambda = 0; when error > 0.5, lambda < 0. This means the model goes nowhere when the error is exactly 0.5, which is why implementations creatively make sure it never hits that magic number and breaks the model.
      Now, back to the first point about the weights, with examples. Say the error is 0.1: then lambda = 1.10, the weight multiplier for a correctly classified data point is 0.33, the multiplier for a misclassified data point is 3.0, and the weight of this classifier is 1.10 (the lambda itself). Say the error is 0.9: then lambda = -1.10, the multiplier for a correctly classified point is 3.0, the multiplier for a misclassified point is 0.33, and the weight of this classifier is -1.10 (the lambda itself).
      From the above, a few things are clear: 1. If a classifier classifies well (low error rate), its overall weight in the ensemble stack is positive, and whatever misclassifications remain are given priority in the next run. 2. If a classifier is poor (high error rate), its overall weight in the ensemble stack is deeply negative, and whatever was classified correctly we keep going in the next run, with the hope of gradually improving its influence.
      Hope that clarifies. Let me know if there is still an open point on this.
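      For illustration (not part of the original reply), a tiny numeric sketch of the arithmetic above using the standard AdaBoost formula lambda = 0.5 * ln((1 - error) / error); it reproduces the 1.10 / 0.33 / 3.0 figures quoted for error rates 0.1 and 0.9:
      import numpy as np

      for error in (0.1, 0.5, 0.9):
          lam = 0.5 * np.log((1 - error) / error)   # weight of the classifier (lambda)
          w_correct = np.exp(-lam)                  # multiplier for correctly classified points
          w_wrong = np.exp(lam)                     # multiplier for misclassified points
          print(f"error={error}  lambda={lam:+.2f}  correct x{w_correct:.2f}  misclassified x{w_wrong:.2f}")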

  • @kevinchaplin672
    @kevinchaplin672 1 year ago

    Very nice, clear and concise

  • @stefanmisanovic3341
    @stefanmisanovic3341 1 year ago

    Great video, very helpful to get ideas how to visualize the accuracy of a model and how to automate the process.

  • @ranaiit
    @ranaiit 1 year ago

    Thanks... there is a missing negative sign in the exponent of the Gaussian function!

  • @bigh8438
    @bigh8438 1 year ago

    what about the bias term? can you do one with b?

    • @machinelearningmastery
      @machinelearningmastery 1 year ago

      Yes, the bias term does exist. Remember that whenever you see the W matrix in the optimization space, W = [w, b]. The optimizer works on W as a whole, so it is a parallel update that lets it calculate the actual loss and minimize it. I will try to see if I can do a video to illustrate this point. Also, as you progress in your ML study, you may come across models like RNNs etc. -- in such cases W = [w, u, b]. Therefore, to generalize our learning, it is important to see that W is a matrix that contains all the weight units (including bias) that the optimizer must search over and feed into the system to get a loss. Hope that clarifies.

    • @bigh8438
      @bigh8438 1 year ago

      @@machinelearningmastery I see. What I meant was: in the video we have function = wx, but would it be possible to do function = wx + b? I think it is the same, just with another term added.

    • @machinelearningmastery
      @machinelearningmastery 1 year ago

      Yes, that is correct; the bias would be another parameter found by gradient descent along with all the other weights it finds for the input features.
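      For illustration (not part of the original reply), a tiny sketch of that point: append a column of ones to X so the bias rides along inside W = [w, b] and gradient descent updates both together; the data and learning rate are illustrative:
      import numpy as np

      rng = np.random.default_rng(0)
      X = rng.normal(size=(100, 1))
      y = 3.0 * X[:, 0] + 0.7 + rng.normal(0.0, 0.1, size=100)   # true w = 3.0, b = 0.7

      Xb = np.hstack([X, np.ones((len(X), 1))])                  # the ones column carries the bias
      W = np.zeros(2)
      for _ in range(500):
          grad = 2 * Xb.T @ (Xb @ W - y) / len(y)                # gradient of the mean squared error
          W -= 0.1 * grad
      print("learned [w, b]:", W.round(2))                       # ~ [3.0, 0.7]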

  • @Darazfinds4350
    @Darazfinds4350 1 year ago

    Can you tell me the values to plug in to calculate the gradient for the 1st iteration???

    • @machinelearningmastery
      @machinelearningmastery 1 year ago

      For use in, say, a Python snippet or Excel to get results similar to my video, you may use a small value like 0.01 to start off. Such small values should not impact your convergence time. If we use large values (and say we got it wrong), it would take longer to converge. But what happens in practice with neural-net training is this: use the first 10 or 25 epochs as "warm-up" epochs. Generally, a large learning rate is used during the warm-up epochs, and the purpose of the warm-up is to get some of the parameters initialized. Following that, the actual training starts. Hope that clarifies.
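      For illustration (not part of the original reply), a minimal Keras sketch of a two-phase schedule along these lines - one learning rate for the first few warm-up epochs and another afterwards; the epoch counts and rates are illustrative assumptions:
      from tensorflow import keras

      def schedule(epoch, lr):
          warmup_epochs, warmup_lr, base_lr = 10, 1e-2, 1e-3
          return warmup_lr if epoch < warmup_epochs else base_lr   # warm-up rate first, then the base rate

      lr_callback = keras.callbacks.LearningRateScheduler(schedule)
      # model.fit(X_train, y_train, epochs=100, callbacks=[lr_callback])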

  • @enter-galactic
    @enter-galactic 1 year ago

    Can you provide code on how to plot sensitivity analysis?

    • @machinelearningmastery
      @machinelearningmastery 1 year ago

      If you are on Keras, it's just a couple of lines. Let me give you the code here:
      # fit the model and capture the return value in history
      history = model.fit(X_train, y_train, epochs=epochs, .....)
      # plot how the validation vs training curves look using the "history" variable
      plt.plot(history.history['accuracy'], label='train')
      plt.plot(history.history['val_accuracy'], label='test')
      plt.legend()
      plt.show()

  • @skullywully653
    @skullywully653 1 year ago

    would it be possible for me to grab that code somewhere?

    • @machinelearningmastery
      @machinelearningmastery 1 year ago

      I have gone over the code in the video. Any particular segment not clear?