🐍My heart is pounding so hard! I can't believe it was that 'simple' to make a model!! It gives me so much confidence to start learning more and add more models to the code I've made
Stop SIMPING dude.
@@Samuel-ik5wp Stop hating
Try making this model without scikit learn.
Then you'll get the real taste of Machine learning
@binarysaiyan9389 What's the best practice, with or without?
@binarysaiyan9389 🤣🤣🤣
Amazing. I really liked how casually you would think to add colors or a trend line and straight away add a few lines of code to reflect that in the output. It shows how comfortable you are. For a beginner like me, it was so instructive to hear you speak your thoughts out loud as you go.
After this tutorial, I can now start my ML Journey confidently. May God bless you Data Professor to continue doing this good work. Cheers
Thanks for the kind words :)
You literally give the brief on machine learning in a very simple and easy way
🎯 Key Takeaways for quick navigation:
00:00 📋 *Introduction to Building Your First Machine Learning Model in Python*
- Overview of building a machine learning model in Python using scikit-learn and Google Colab.
- Naming conventions and organizing Jupyter Notebooks.
01:12 📊 *Loading and Exploring the Dataset*
- Introduction to the Delaney dataset, which contains information about molecule solubility.
- Explanation of comma-separated values (CSV) format and dataset columns.
05:11 📦 *Data Preparation: Splitting Data into Features (X) and Target (Y)*
- How to separate the dataset into features (X) and the target variable (Y).
- Explanation of the training set and testing set split.
09:22 ⚙️ *Building a Linear Regression Model*
- Importing the Linear Regression model from scikit-learn.
- Training the model on the training dataset.
13:18 🧪 *Model Evaluation: Mean Squared Error and R-squared*
- Calculating Mean Squared Error (MSE) and R-squared for both training and testing sets.
- Organizing and presenting the results in a Pandas DataFrame.
22:13 🌲 *Building a Random Forest Regressor Model*
- Introduction to using the Random Forest Regressor model.
- Organizing notebook sections and headings for clarity in the notebook structure.
23:47 📊 *Building Regression Models*
- Explains the distinction between regression and classification models based on the nature of the target variable.
- Demonstrates how to create a random forest regressor with specified parameters.
- Covers the training process of the regression model.
25:01 📈 *Model Performance Evaluation*
- Discusses using mean squared error and R2 score for evaluating model performance.
- Shows how to apply these metrics to the random forest regression model.
- Emphasizes being cautious about typos in code.
26:08 🧷 *Combining Model Results*
- Explains the process of combining the results of linear regression and random forest regression models.
- Demonstrates how to concatenate the results into a single DataFrame.
- Provides tips on reindexing and organizing the combined data.
28:03 📊 *Data Visualization of Prediction Results*
- Introduces data visualization using matplotlib for comparing predicted and actual values.
- Guides in creating a scatter plot with labeled axes.
- Adds a trendline to the plot using numpy's polyfit function.
29:56 🎉 *Conclusion and Further Exploration*
- Summarizes the process of building a machine learning model in Python using scikit-learn.
- Encourages viewers to explore different models, tweak learning parameters, and refer to scikit-learn's documentation.
- Requests viewers to share their model-building experiences in the comments.
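For anyone following along, the steps in the outline above can be sketched roughly as follows. This is a minimal sketch, not the notebook from the video: the data is a synthetic stand-in generated with `make_regression` (an assumption, so it runs without downloading the solubility CSV), and the `evaluate` helper is a hypothetical name introduced only for illustration.

```python
import pandas as pd
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Assumption: synthetic data with 4 numeric features stands in for the real dataset.
X, y = make_regression(n_samples=500, n_features=4, noise=10.0, random_state=0)

# Hold out 20% of the rows for testing; random_state makes the split repeatable.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=100
)


def evaluate(name, model):
    """Fit a model and return its train/test MSE and R2 as one table row."""
    model.fit(X_train, y_train)
    train_pred = model.predict(X_train)
    test_pred = model.predict(X_test)
    return {
        "Method": name,
        "Training MSE": mean_squared_error(y_train, train_pred),
        "Training R2": r2_score(y_train, train_pred),
        "Test MSE": mean_squared_error(y_test, test_pred),
        "Test R2": r2_score(y_test, test_pred),
    }


# One row per model, combined into a single DataFrame for comparison.
results = pd.DataFrame(
    [
        evaluate("Linear regression", LinearRegression()),
        evaluate("Random forest", RandomForestRegressor(max_depth=2, random_state=100)),
    ]
)
print(results)
```

Building the rows as dictionaries avoids the separate transpose and concatenate steps, but the resulting table has the same shape: one row per model, one column per metric.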
"Underrated comment" - every bot comment ever
(But still a good comment with no likes)
I didn't have any prior knowledge of Data Science or Machine Learning, but as a visual learner,
I finally understand the purpose of the mathematical equation y = f(X).
Initially, after watching a couple of videos and starting from a math tutorial, I was confused about the relevance of math in this field.
But now I see its importance, and I am grateful for this new understanding.
Thanks @DataProfessor
This is the clearest ML Tutorial I’ve ever watched ❤❤❤
One of the best beginner videos available on YT😍
This video is amazing! You explain things so clearly, and the quality is excellent.
I usually don't spend time commenting on youtube, but dude this is a great video! Easy to follow alone, and very helpful. Thank you
Glad to hear, I appreciate that!
This was the easiest video on the ML model. Thanks Prof.
🐍This was my first video on ML and you made it really easy to follow. I did this because I wanted to feel what it is like to do this, and I like it. I will definitely follow through and study the actual maths behind it. Thank you for keeping me motivated.
This is the best machine learning tutorial that i've ever come across !❤
Simple, Clean and Informative !
Thank you sir ! 🙏
Thanks so much for the kind words :)
You're a very good teacher. Taught a complex topic so simply
Thanks for the kind words 😊
I am really excited about your channel. This is my first day harvesting from it, and I will certainly not forget to support it! 🙃😊
Excellent video. You explained everything very simply. What I learnt over a month in the classroom, you explained in a few minutes. It is a quick recap for me. Thank you.
🐍Yo, this is my first ML project ever. Thank you. Very clear and concise. I need to go over this again and again and learn from this as well as other tutorials and courses.
I have spent the past 4 months trying to do something tangible but haven't found anything as practical as this.
It took about an hour and a half, I believe, and I am done with my first machine learning model.
This has boosted my confidence and now I am going to build more.
Thanks, Professor
Awesome as always.. I'm applying this to my QSAR dataset in a bit.. Thanks for churning out great n useful content as always.. You d bestest Data Professor.
Glad it was helpful! And thanks for the kind words :)
🐍 This was SO helpful. I feel like I had just been going through the motions in that process before, but you explained WHY each step is done. And I will definitely be using headings from now on to create that table of contents. Thanks, Data Professor, for another great video!
Glad it was helpful! Yes, using headings and sub-headings helps my future self to skim through the code fairly quickly :)
Thank you so much for this video. It is very helpful, especially how you explain line by line how the code actually works.
This is better than my college sir
Hi! Can you please say in one line what exactly we found at the end? I see it shows linear regression, but what exactly does it depict?
before this video, Machine learning was sorcery and scary. Not anymore. Thank you for your neat and casual explanations that make me love the science and the math behind those functions. I am really looking forward to giving it a shot and learning more. thanks a lot. like and sub-earned
Glad to hear that this video was helpful!
The best ML tutorial I have watched. Hats off for your work, sir. I really enjoyed it and the explanations were really great. Thanks again. You earned a subscriber here. Keep up the great work, sir.
You're very welcome!
🐍 great video, simple and direct approach, simple models, simple problem. It'd be nice to have more of those simple and direct videos with simple classification and clustering problems, just to get a grasp on them :D
Great suggestion! More in the pipeline
@DataProfessor The "Titles" organization inside the notebook really helped to make it more fluid!
🐍 Thanks for this. I appreciate you not just going straight into the core material, but talking about the organization with the headers.
The only video that I understood about ML. Thank you!
🐍 Your explanation was clear, concise, and appreciated! You are an amazing teacher who hits all the important points without adding any confusion. Thank you for your time in doing this!
Wow, thank you for the kind words! Glad to hear that the video is helpful!
🐍 awesome video. This one was so much easier to follow along with compared to other tutorials I have watched.
Glad it was helpful!
This was awesome, thanks to your knowledge I feel I can have fun with experimenting with different algorithms and its parameters.🐍
You made the process very simple for us! Great Video! Keep up the good Work!
Thank you for this .
This video is helping me build my first model
this is literally the first machine learning model that i have built
looking to take it further
Thanks for the tutorial
🐍🐍🐍🐍🐍🐍🐍🐍🐍🐍🐍🐍🐍🐍
Thanks a lot! Great tutorial, I love how you explain every step of it so it is easy to understand what you are doing.
🐍
Great video Data Professor!
I really liked this tutorial because you didn't miss explaining terms you used without making any assumptions
Thanks! 😃
🐍Amazing depth covered in a short time-frame! Thank-you for compiling this introductory ML tutorial!
Excellent video to help understand the basics behind training new models. Just one suggestion: I missed a little more explanation of how the different parameters of the training algorithm affect the end result.
Thank you so much bro this is the clearest explanation I have seen 🐍.
The explanations are so easy to understand.
🐍Amazing intro to ml. It really felt easy to build my first ml model
This is amazing Data Professor. Hope to see more of this stuff.
Thank you so so much for this tutorial, your videos deserve many more views! Could you please also do a video explaining how to do k-fold cross-validation as well as the methods to measure and compare the accuracy of ML models (confusion matrix, F1 score, etc.)? Thank you so much for your videos. I learn much more from you than from my other profs at the university!
Sure thing! Thanks for the suggestion. Also glad the videos are helpful :)
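Until that video exists, here is a rough sketch of both ideas using scikit-learn. The regression data is synthetic (an assumption) and the class labels in the second half are made up by hand purely to show the metric calls. Note that the confusion matrix and F1 score apply to classification, not to the regression models built in this tutorial.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.metrics import confusion_matrix, f1_score
from sklearn.model_selection import KFold, cross_val_score

# --- k-fold cross-validation for a regression model ---
# Assumption: synthetic data in place of a real dataset.
X, y = make_regression(n_samples=300, n_features=4, noise=10.0, random_state=0)

# 5 folds: each fold is used once as the validation set.
cv = KFold(n_splits=5, shuffle=True, random_state=100)
scores = cross_val_score(LinearRegression(), X, y, cv=cv, scoring="r2")
print("R2 per fold:", scores)
print("Mean R2:", scores.mean())

# --- confusion matrix and F1 score for a classifier's predictions ---
# Hand-made labels, only to illustrate the metric functions.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

cm = confusion_matrix(y_true, y_pred)  # rows: actual class, columns: predicted class
f1 = f1_score(y_true, y_pred)
print(cm)
print("F1:", f1)
```

With these labels there are 3 true negatives, 1 false positive, 1 false negative and 3 true positives, so precision and recall are both 0.75 and the F1 score is 0.75 as well.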
Thank you so much for this tutorial. Great to follow and to get an idea of how to actually build a model
Great Tutoring from the Data Professor.
Glad you think so!
Hi! Great video. A quick question... when do we use linear regression as in statistics and when do we use it as an ML algorithm? Both try to find coefficients and both try to minimize the error (however it is defined), but I am still confused about the difference between the classical linear regression I learnt in a basic algebra course and the ML procedure. Thanks!
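One way to look at this question: the fitting step is the same ordinary least squares in both cases, and what differs is mostly the goal (explaining coefficients versus predicting unseen data). The sketch below, on made-up data, solves the classical least-squares problem with NumPy and compares the coefficients with those from scikit-learn's `LinearRegression`.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Assumption: made-up data with known coefficients plus a little noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = 1.5 + X @ np.array([2.0, -1.0, 0.5]) + rng.normal(scale=0.1, size=200)

# Classical approach: add a column of ones for the intercept and
# solve the least-squares problem directly.
A = np.column_stack([np.ones(len(X)), X])
beta, *_ = np.linalg.lstsq(A, y, rcond=None)

# "ML" approach: scikit-learn's LinearRegression on the same data.
model = LinearRegression().fit(X, y)
sk_beta = np.concatenate([[model.intercept_], model.coef_])

print("lstsq:       ", beta)
print("scikit-learn:", sk_beta)
print("Same coefficients:", np.allclose(beta, sk_beta))
```

In the ML workflow the extra steps are the train/test split and the evaluation on held-out data, not a different way of finding the line.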
Wow! Thank you for a simple, clear explanation 👏👏
Thank You , May God bless you to continue doing this Great work.
🐍Thanks for such an excellent introduction to running ML models in Python - so clearly explained, and super helpful tips on working within notebooks!
Python really is a whole different beast.. Just to accomplish the task we used numpy, pandas, sklearn.model_selection, sklearn.linear_model, sklearn.metrics, sklearn.ensemble, and matplotlib.
That's 7 libraries.. So overwhelming...
Thanks for the guide. I did it locally within VSCode, so had to keep tweaking and installing libraries along the way, but it worked out well. The ML journey begins. Thanks for the training wheels!
TY you saved me for training our model for my Final Year Project
Can you suggest what prerequisites I should know before jumping into this?
Great video ! After making the model can we deploy this to AWS? or do we need to reprogram in AWS again?
🐍 thank you for the great tutorial of the building machine learning model.
Liking the video so far, but at 15:20, I’m getting a “name ‘X_train’ is not defined” error? Also it’s saying “name ‘X’ not defined” in the data splitting section
Apparently I had an extra empty code block right above that caused this issue -.-
It's super beginners friendly😂 thank you!!
I really love this video very explicit and well thought !!❣❣
Hi! could you please explain what the data is about, like what logS stands for?
Brilliant! Extremely helpful... Thank you!
very good video. cleared most of my fears about building ML models
🐍wow!! thanks... sticking around you will definitely get me through my internship with flying colours!
🐍 Thank you so much. I can now dive into ML with confidence.
Great video, straightforward and clear explanations.
What is the difference between the model (linear regression) developed in Google Colab and one developed in Microsoft Excel? Here you said that the model developed (using Colab) is a machine learning model. Can we say the model developed using Microsoft Excel is also a machine learning model?
This was a big help, thank you! Question: what's the significance of the value chosen for random_state? I get that you need a number so that you'll get the same split each time, but I've seen some people use random_state=100 and some use random_state=200. I don't understand the difference. If anyone could shed a little light on that for me, I'd be very grateful. Thanks
Hi Dalton, random_state is a parameter that seeds the randomness used by the function of interest, whether that is data splitting or ML model building. A specific number splits the data into one particular pair of subsets, while an ML algorithm may use it to pick its random starting point (for example, initial weight values). To get exactly the same results we need to use the same number. If the random_state number changes, think of it as shuffling a deck of cards again: you get a different subset of data or a different starting point for the weights.
@DataProfessor OK I thiiiink I got it... random_state splits the data into subsets of the specified size? Why does it need to be split if you're training/testing the entire data frame? Sorry, I'm just brand new to this. Any recommendations you have for straightforward reading on this somewhere online, so that I'm not bombarding you with super basic questions, would also be good. Thanks again!!!
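A small sketch may help with this exchange. The subset size is set by `test_size`, while `random_state` only decides which rows land in each subset; the split itself exists so the model can be scored on rows it never saw during training. The example splits plain row indices (an assumption, to keep the output readable) instead of a real data frame.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Assumption: 100 row indices stand in for the rows of a data frame.
rows = np.arange(100)

# Same seed twice: the held-out rows are identical.
_, test_100a = train_test_split(rows, test_size=0.2, random_state=100)
_, test_100b = train_test_split(rows, test_size=0.2, random_state=100)

# Different seed: same subset size, but a different shuffle of the rows.
_, test_200 = train_test_split(rows, test_size=0.2, random_state=200)

print("seed 100 twice identical:", np.array_equal(test_100a, test_100b))
print("test rows with seed 100:", np.sort(test_100a))
print("test rows with seed 200:", np.sort(test_200))
```

Neither 100 nor 200 is better than the other. Any fixed integer works, as long as it stays the same when you want to reproduce a result.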
How would you go about determining how accurate your model is against the test dataset?
Do you have a video on how to check precision, F1 score and accuracy?
Does anyone have a dataset of clap sounds (single, double and triple claps) that could trigger play/pause, previous and next in a music player? Please help me with this, anyone.
🐍Thanks for this video. It will help me move forward in my final project in the university. Keep up the Good work. God bless you🙏.
The part I did not understand was the evaluation. For example, in the end we got a training MSE of 1.007 for linear regression. Is that good or bad? Is the 1.028 from random forest a better or worse result?
This is super. Thank you so much. Keep up the good work...
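On reading MSE: lower is better, so between two models scored on the same data the smaller value wins. An MSE on its own is hard to judge, though, because it is in squared units of the target. One common way to get a reference point, sketched below on synthetic data (an assumption), is to compare against a baseline that always predicts the training mean and to take the square root so the error is in the target's own units.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.dummy import DummyRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Assumption: synthetic data in place of the solubility dataset.
X, y = make_regression(n_samples=500, n_features=4, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=100
)

# Baseline: ignore the features and always predict the mean of y_train.
baseline = DummyRegressor(strategy="mean").fit(X_train, y_train)
model = LinearRegression().fit(X_train, y_train)

baseline_mse = mean_squared_error(y_test, baseline.predict(X_test))
model_mse = mean_squared_error(y_test, model.predict(X_test))

# RMSE is in the same units as y, which makes it easier to interpret.
model_rmse = np.sqrt(model_mse)

print(f"Baseline test MSE: {baseline_mse:.2f}")
print(f"Model test MSE:    {model_mse:.2f}")
print(f"Model test RMSE:   {model_rmse:.2f}")
```

A model is only useful if it clearly beats the baseline, and the comparison between models is most meaningful on the test set, not the training set.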
Actually a great video sir, now we have trained the model right?, now if i want to use this model so i just need to copy paste the name of this jupyter file right??
Can you make video of feature selection methods for QSAR in big dataset
Sounds like a great topic for a future video
Best video I have found on YouTube. Have you created playlists related to AI, ML, NN? 🤔
Great start👌
Thank you, I love this video. I am new to ML.
Very thorough in your thoughts and explanation.
Question: What do you do with the prediction afterwards?
🐍 this video got me through a few projects. Thank you!
🐍 Great way of explaining machine learning! Made it really easy to understand!
already so much better than datascience 365 tutorials!
Very cool video, much thanks for this. In addition to the video, I would like to know as a final part : how can we apply this model in a real life case ? Where do I put my python code ? How to integrate it ?
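One common way to reuse a trained scikit-learn model outside the notebook is to save the fitted object to a file and load it wherever predictions are needed. The sketch below uses `joblib` for that. The file name `solubility_model.joblib` is hypothetical and the data is synthetic (an assumption).

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression

# Assumption: synthetic data in place of the real training set.
X, y = make_regression(n_samples=200, n_features=4, noise=5.0, random_state=0)
model = LinearRegression().fit(X, y)

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical file name; in practice this would be a permanent path.
    path = os.path.join(tmp, "solubility_model.joblib")

    # In the notebook: save the fitted model to disk.
    joblib.dump(model, path)

    # In the application (script, web service, ...): load it back.
    loaded = joblib.load(path)

# New rows must have the same columns, in the same order, as the training data.
new_rows = X[:3]
same_predictions = np.allclose(model.predict(new_rows), loaded.predict(new_rows))
print("Loaded model gives the same predictions:", same_predictions)
```

The loading side needs a compatible scikit-learn version installed, and a saved model file should only be loaded from a source you trust.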
So, if we have experimental data already trained and tested, does this mean we can predict the solubility of the new molecule yet to be synthesized?
Thanks for simplifying it.
🐍Wow. This is my first time witnessing how models can be created. Great video as well.
We do prediction on the test data. Do we also do it on the training data?
Great video. I thought the comparison at the end would show the actual values from the test set vs the predicted values for Y. Could you explain the difference?
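For anyone who wants the test-set comparison described here, it can be tabulated directly. This is a sketch on synthetic data (an assumption), not the plot from the video. It also fits a straight trend line through actual versus predicted values with `np.polyfit`, where a slope near 1 and an intercept near 0 indicate that predictions track the actual values well.

```python
import numpy as np
import pandas as pd
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Assumption: synthetic data in place of the solubility dataset.
X, y = make_regression(n_samples=200, n_features=4, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=100
)

model = LinearRegression().fit(X_train, y_train)
y_test_pred = model.predict(X_test)

# Side-by-side table of actual vs predicted values on the held-out rows.
comparison = pd.DataFrame({"Actual": y_test, "Predicted": y_test_pred})
comparison["Residual"] = comparison["Actual"] - comparison["Predicted"]
print(comparison.head())

# Degree-1 polynomial fit: the trend line through (actual, predicted) points.
slope, intercept = np.polyfit(y_test, y_test_pred, 1)
print(f"Trend line: predicted = {slope:.3f} * actual + {intercept:.3f}")
```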
WOW....very interesting and very helpful....Thank you
Wish you would explain the training mods you use while explaining how to do this.
Crisp and clear. Thank you.
I need help here
I want to build a ML project, which converts text to Excel files, I have a huge volume of data set to train. How to start with this. Please share
To convert text to Excel files in Python, you can use pandas and some regular expressions
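As a rough illustration of that suggestion, the sketch below parses lines of text with a regular expression and collects the matches into a pandas DataFrame. The input lines and the pattern are made up for the example, since the actual text format is not given. The final `to_excel` call is left commented out because it needs an Excel writer package such as openpyxl installed.

```python
import re

import pandas as pd

# Assumption: made-up input; the real text format is unknown.
raw_text = """\
Order 1001: 3 x widget @ 2.50
Order 1002: 1 x gadget @ 10.00
this line does not match and is skipped
Order 1003: 12 x widget @ 2.25
"""

# Named groups become the column names of the table.
pattern = re.compile(
    r"Order (?P<order_id>\d+): (?P<qty>\d+) x (?P<item>\w+) @ (?P<price>\d+\.\d+)"
)

# Keep only the lines that match the pattern.
rows = [
    m.groupdict()
    for line in raw_text.splitlines()
    if (m := pattern.match(line))
]

# Regex groups are strings, so convert the numeric columns explicitly.
table = pd.DataFrame(rows).astype({"order_id": int, "qty": int, "price": float})
print(table)

# Writing the table out requires openpyxl (or another Excel engine):
# table.to_excel("orders.xlsx", index=False)
```

If the text follows a fixed pattern like this, no model training is needed at all. Machine learning only becomes relevant when the structure cannot be described by rules.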
@DataProfessor thank you!
Great video! I enjoyed a lot and learned a lot. Many thanks for that.
Very nicely explained. But you should have explained the outputs more, like what does that chart mean shown at the end?
thanx sir
I loved to learn this I'll go create my own model now
Amazing tutorial! Thank you very much!!!
🐍
Finally a simple video. Although I had a book for all this, I think its pacing was too slow to start me off.
Now I am more interested in knowing why and how some of this code works.
This is my process on being able to really understand and process information for coding lol
Thanks for the video!
Can you make a video on prediction based on 2 separate CSV files for train and test? I am not able to find any videos on that on YouTube.
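When the train and test data already come as two files, the splitting step is simply skipped: fit on one file and predict on the other. A minimal sketch follows, using in-memory text in place of real CSV files. The column names `x1`, `x2` and `y` are made up for the example, and the toy values follow y = x1 + 2 * x2 exactly.

```python
import io

import pandas as pd
from sklearn.linear_model import LinearRegression

# Assumption: in-memory stand-ins for train.csv and test.csv.
train_csv = io.StringIO("x1,x2,y\n1,2,5\n2,1,4\n3,3,9\n4,1,6\n5,2,9\n")
test_csv = io.StringIO("x1,x2,y\n6,1,8\n2,2,6\n")

train = pd.read_csv(train_csv)
test = pd.read_csv(test_csv)

# Both files must share the same feature columns, in the same order.
feature_columns = ["x1", "x2"]

model = LinearRegression().fit(train[feature_columns], train["y"])
predictions = model.predict(test[feature_columns])

print(pd.DataFrame({"Actual": test["y"], "Predicted": predictions}))
```

With real files, replace the `io.StringIO` objects with the file paths passed to `pd.read_csv`.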
This was really nice content i have been struggling from where to start for my project and this video has just given me the way thanks @Data professor😇😇😇
Glad it was helpful!
Amazing explanation, thanks
I want to learn how to create a simple but useful neural network in my home lab . Do you have any video or ideas to share?
So how would I go about doing this but for strings? Like if I had something like:
DOG - CAT - MONKEY
And I want the result to be those tokens separated (they’re not guaranteed to be separated by a dash). How would I do that?
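For a task like this, a rule-based split may be enough before reaching for a model. The sketch below treats any run of characters that is not a letter or digit as a separator. That rule is an assumption about the data, and `split_tokens` is a hypothetical helper name.

```python
import re


def split_tokens(text):
    """Split on any run of non-alphanumeric characters and drop empty pieces."""
    return [token for token in re.split(r"[^A-Za-z0-9]+", text) if token]


print(split_tokens("DOG - CAT - MONKEY"))
print(split_tokens("DOG,CAT;MONKEY"))
print(split_tokens("  DOG / CAT   MONKEY "))
```

All three calls return `['DOG', 'CAT', 'MONKEY']`. If the tokens can run together with no separator at all, a rule like this no longer works and the problem becomes a genuine sequence-labelling task.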
I think you should also explain why random forest is used and what the difference is between linear regression and random forest. Why is max_depth set to 2? All these details matter, because they are essential for anyone to understand what is actually being done.
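A small experiment can make these differences concrete. Linear regression fits one straight line (or plane), while a random forest averages many decision trees and can follow curved relationships. `max_depth` limits how many successive splits each tree may make, so a small value gives a coarse, step-like fit. The sketch below uses a made-up sine-shaped target (an assumption chosen to be clearly nonlinear) and compares train and test R2 for the three settings.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Assumption: a made-up nonlinear relationship, y = sin(x) plus noise.
rng = np.random.default_rng(0)
X = rng.uniform(0, 6, size=(400, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.1, size=400)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=100
)

models = {
    "linear regression": LinearRegression(),
    "random forest, max_depth=2": RandomForestRegressor(max_depth=2, random_state=100),
    "random forest, max_depth=None": RandomForestRegressor(max_depth=None, random_state=100),
}

# Store (train R2, test R2) per model so the two can be compared.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = (
        r2_score(y_train, model.predict(X_train)),
        r2_score(y_test, model.predict(X_test)),
    )
    print(f"{name}: train R2 = {scores[name][0]:.3f}, test R2 = {scores[name][1]:.3f}")
```

A large gap between train and test R2 is a sign of overfitting, which is one reason to score the training set as well as the test set. Which model wins depends on the data, so results on this toy curve say nothing about the solubility dataset.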
i just have a doubt, why do we need also to predict the training set?
Is the training-to-testing ratio always 80/20, or is there some reasoning behind that specific ratio?
It's an arbitrary number but a popular one at that, probably because of the Pareto 80/20
Thanks professor 🤝