Sebastian Mantey
  • 52
  • 371 542
Channel Update #2: Finishing 5 Kaggle Competitions in the Top 10%
In the first video of the “Channel Update” playlist, I talked about my goal for my YouTube channel, which is to gain mastery of data science. And now, in this video, I want to talk about the first big project that I want to undertake in pursuit of that goal.
Links:
- Corresponding blog post (and slides): www.sebastian-mantey.com/blog/finishing-5-kaggle-competitions-in-the-top-10
- Titanic Competition: www.kaggle.com/c/titanic
- House Prices Competition: www.kaggle.com/c/house-prices-advanced-regression-techniques
Timestamps:
0:00 - Intro
0:22 - Defining the Scope of the Project
2:01 - Road Map for completing the Project
6:52 - Prediction for the Number of Subscribers
Views: 453

Videos

Channel Update #1: My Goal for this Channel
271 views • 4 years ago
This is the first video of my “Channel Update” playlist. And in this video, I want to talk about what my goal for this channel is and, more importantly, why I want to pursue it. Links: - Corresponding blog post (and slides): www.sebastian-mantey.com/blog/my-goal-for-my-youtube-channel - "Drive" by Daniel H. Pink: www.amazon.com/Drive-Surprising-Truth-About-Motivates/dp/1594484805/ref=sr_1_1?dch...
Naive Bayes from Scratch in Python: 2. Step of the Algorithm
1.1K views • 4 years ago
In this series, we are going to code a Naive Bayes classifier from scratch in Python. And in this video, we are going to write the code that is going to execute the second step of the Naive Bayes algorithm. Links: - Corresponding blog post: www.sebastian-mantey.com/code-blog/coding-a-naive-bayes-classifier-from-scratch-python-p3-second-step-of-the-algorithm - GitHub repo: github.com/SebastianMa...
Naive Bayes from Scratch in Python: 1. Step of the Algorithm
3.1K views • 4 years ago
In this series, we are going to code a Naive Bayes classifier from scratch in Python. And in this video, we are going to write the code that is going to execute the first step of the Naive Bayes algorithm. Links: - Corresponding blog post: www.sebastian-mantey.com/code-blog/coding-a-naive-bayes-classifier-from-scratch-python-p2-first-step-of-the-algorithm - GitHub repo: github.com/SebastianMant...
Naive Bayes from Scratch in Python: Introduction
2K views • 4 years ago
In this series, we are going to code a Naive Bayes classifier from scratch in Python. And this video serves as an introduction. I will present the data that we are going to be working with, namely the Titanic data set. And I will also give a short recap of how the Naive Bayes algorithm works. Links: - Corresponding blog post: www.sebastian-mantey.com/code-blog/coding-a-naive-bayes-classifier-fr...
Naive Bayes explained: Why "naive"?, the problem of rare values, continuous features, regression
534 views • 4 years ago
In this video, we are going to cover some additional points about the Naive Bayes algorithm, namely: What’s “naive” about Naive Bayes, how to handle the problem of rare values, how to handle continuous features and classification vs. regression. Links: - Corresponding blog post (and slides): www.sebastian-mantey.com/theory-blog/naive-bayes-algorithm-explained-p2 - Kaggle Titanic Data Set: www.k...
Naive Bayes explained
1K views • 4 years ago
In this video, we are going to cover how the Naive Bayes algorithm works. For that, we are going to use the well-known Titanic data set. Links: - Corresponding blog post (and slides): www.sebastian-mantey.com/theory-blog/naive-bayes-algorithm-explained-p1 - Kaggle Titanic Data Set: www.kaggle.com/c/titanic - Data that I already prepared: github.com/SebastianMantey/Naive-Bayes-from-Scratch - "Na...
Post-Pruning from Scratch in Python p.3
2.2K views • 4 years ago
In this video, we continue working on a post-pruning algorithm from scratch. Links: - GitHub repo: github.com/SebastianMantey/Decision-Tree-from-Scratch - Corresponding blog post: www.sebastian-mantey.com/code-blog/coding-a-decision-tree-from-scratch-python-p14-post-pruning-from-scratch-3 - "Decision Tree from Scratch" playlist: th-cam.com/video/y6DmpG_PtN0/w-d-xo.html
Post-Pruning from Scratch in Python p.2
2.1K views • 4 years ago
In this video, we continue working on a post-pruning algorithm from scratch. Links: - GitHub repo: github.com/SebastianMantey/Decision-Tree-from-Scratch - Corresponding blog post: www.sebastian-mantey.com/code-blog/coding-a-decision-tree-from-scratch-python-p13-post-pruning-from-scratch-2 - "Decision Tree from Scratch" playlist: th-cam.com/video/y6DmpG_PtN0/w-d-xo.html
Post-Pruning from Scratch in Python p.1
6K views • 4 years ago
In this video, we are going to start coding a post-pruning algorithm from scratch. Links: - GitHub repo: github.com/SebastianMantey/Decision-Tree-from-Scratch - Corresponding blog post: www.sebastian-mantey.com/code-blog/coding-a-decision-tree-from-scratch-python-p12-post-pruning-from-scratch-1 - Decision Tree Pruning explained: th-cam.com/video/u4kbPtiVVB8/w-d-xo.html - "Decision Tree from Scr...
Decision Tree Pruning explained (Pre-Pruning and Post-Pruning)
44K views • 4 years ago
In this video, we are going to cover how decision tree pruning works. First, we will answer the question of why we even need to prune trees. Then, we will go over two pre-pruning techniques. And finally, we will see how post-pruning works. Links: - Corresponding blog post: www.sebastian-mantey.com/theory-blog/decision-tree-algorithm-explained-p4-decision-tree-pruning - Post-Pruning ...
Git Tutorial with Python p.5 - Working with a Remote Repo on GitHub
876 views • 4 years ago
In this video series we are going to cover the basics of Git. And in this video, we will see how we can work with a remote repo on GitHub. However, we will only see how to make use of the remote as an individual and not how to use it for collaboration in a team. Links: - Corresponding blog post (and slides): www.sebastian-mantey.com/posts/git-tutorial-with-python-p4-working-with-a-remote - GitH...
Git Tutorial with Python p.4 - Working with Branches
540 views • 4 years ago
In this video series we are going to cover the basics of Git. And in this video, we will see three different types of merges while using branches, namely a fast-forward merge, a three-way merge without a merge conflict and a three-way merge with a merge conflict. We will then also see how to resolve the merge conflict. Links: - Corresponding blog post (and slides): www.sebastian-mantey.com/post...
Git Tutorial with Python p.3 - Creating Commits
362 views • 4 years ago
In this video series we are going to cover the basics of Git. And in this video, we will see how we can use the staging area to create commits that represent a meaningful change for our project. Links: - Corresponding blog post (and slides): www.sebastian-mantey.com/posts/git-tutorial-with-python-p3-creating-commits - GitHub repo: github.com/SebastianMantey/Git-Tutorial - "Git Tutorial" playlist: t...
Git Tutorial with Python p.2 - Inspecting the Commit History of an already existing Project
505 views • 4 years ago
In this video series we are going to cover the basics of Git. And in this video, we will inspect the commit history of an already existing project, namely a command-line based Tic-Tac-Toe game. Links: - Corresponding blog post (and slides): www.sebastian-mantey.com/posts/git-tutorial-with-python-p2-inspecting-the-commit-history-of-an-existing-project - GitHub repo: github.com/SebastianMantey/Gi...
Git Tutorial with Python p.1 - Key Concepts (Version Control, Commits, Branches, Remote Repo etc.)
792 views • 4 years ago
Conda Tutorial (Python) p.2: Commands for Managing Environments and Packages
7K views • 5 years ago
Conda Tutorial (Python) p.1: Package and Environment Manager | Anaconda vs. Miniconda | Conda
17K views • 5 years ago
Coding a Decision Tree from Scratch in Python p.11: Regression from Scratch
4.4K views • 5 years ago
Coding a Decision Tree from Scratch in Python p.10: Regression explained
4.5K views • 5 years ago
Coding a Random Forest from Scratch in Python p.3: Creating the Forest and making Predictions
6K views • 5 years ago
Coding a Random Forest from Scratch in Python p.2: Bootstrapping and Random Subspace Method
8K views • 5 years ago
Coding a Random Forest from Scratch in Python p.1: Random Forest Algorithm explained
24K views • 5 years ago
Coding a Decision Tree from Scratch in Python p.9: Code Update
5K views • 5 years ago
Basics of Deep Learning Part 15: Coding a Neural Network from Scratch in Python
1.7K views • 5 years ago
Basics of Deep Learning Part 14: How to train a Neural Network
1.5K views • 5 years ago
Basics of Deep Learning Part 13: Implementing the Backpropagation Algorithm with NumPy
8K views • 5 years ago
Basics of Deep Learning Part 12: Backpropagation explained Step by Step cont'd
1.2K views • 5 years ago
Basics of Deep Learning Part 11: Backpropagation explained Step by Step cont'd
1.3K views • 5 years ago
Basics of Deep Learning Part 10: Backpropagation explained Step by Step cont'd
1.2K views • 5 years ago

Comments

  • @nrted3877
    @nrted3877 a month ago

    you look like Brad Pitt, awesome video

  • @ashutoshsinghai713
    @ashutoshsinghai713 2 months ago

    I really like all of your videos. I always look for videos where things are explained with examples as well as implemented from scratch, and all of your videos meet my expectations. Thank you!

    • @SebastianMantey
      @SebastianMantey 2 months ago

      Thanks! Nice to hear!

  • @sethwilliams501
    @sethwilliams501 2 months ago

    thank you friend

  • @lucutes2936
    @lucutes2936 2 months ago

    pessimistic vs optimistic pruning?

  • @affanahmedkhan7362
    @affanahmedkhan7362 2 months ago

    Phenomenal job ❤❤❤❤❤

  • @affanahmedkhan7362
    @affanahmedkhan7362 2 months ago

    May Allah reward you with goodness, brother.

  • @kabeersekhri
    @kabeersekhri 2 months ago

    I am getting the error “'dict' object is not callable” when calling getbestsplit. How do I fix that?

  • @izb1275
    @izb1275 7 months ago

    Thanks for the tutorial, I finally understand pre-pruning and post-pruning!

  • @FloraInn
    @FloraInn 8 months ago

    Thank you so much for this excellent series! This helps me a lot.

  • @abdallahmohamed8629
    @abdallahmohamed8629 9 months ago

    Thank you, you are the best teacher I have ever seen. It usually takes me a long time to understand this, but with you I got it clearly, so please don't stop ❤

  • @FaktariaID
    @FaktariaID 9 months ago

    Why do we use sigmoid? Isn't it softmax that we use for multi-class problems? Please answer, I'm really confused.

    • @SebastianMantey
      @SebastianMantey 9 months ago

      Yes, that is correct, softmax would be better. However, my goal for this series was to cover just the very basics that one would need to make a neural net work. Therefore, I would consider softmax to be beyond the scope of basics since it’s a technique to improve the performance of the neural net (and not an element that you absolutely need to “just” make the neural net work).
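
      For reference, a minimal sketch of what a softmax output could look like in NumPy (illustrative only, not code from the series):

      import numpy as np

      def softmax(z):
          # shift each row by its max for numerical stability; the result is unchanged
          z = z - np.max(z, axis=1, keepdims=True)
          exp_z = np.exp(z)
          return exp_z / np.sum(exp_z, axis=1, keepdims=True)

      # raw output scores (logits) for 2 samples and 3 classes
      logits = np.array([[2.0, 1.0, 0.1],
                         [0.5, 2.5, 0.3]])
      probabilities = softmax(logits)
      print(probabilities.sum(axis=1))  # each row sums to 1.0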

  • @sachinkaushik8580
    @sachinkaushik8580 10 months ago

    Wonderful video, thank you so much. I tried to build a decision tree from scratch by myself and faced several issues; then I watched some YouTube videos and those were too complicated, so I gave up on coding a decision tree by myself. But you made it super easy with the way you explained it. I am extremely surprised that this video has only 146 likes; it deserves millions. Thanks a ton for your efforts!

    • @SebastianMantey
      @SebastianMantey 10 months ago

      I’m glad that the videos were helpful to you. And thanks for the kind words!

  • @tushargoel3521
    @tushargoel3521 11 months ago

    Can you also provide us with the sample data for the decision tree?

  • @abdulalinawrozie8070
    @abdulalinawrozie8070 a year ago

    It looks like there is no bias. How do you do backpropagation for the bias values?

  • @tobiasdamaske1468
    @tobiasdamaske1468 a year ago

    I can only agree with what has been said so far. Such good content and easy-to-grasp explanations.

  • @dangpham1547
    @dangpham1547 a year ago

    Thank you so much, your content deserves much more attention!

  • @luckytraderchow
    @luckytraderchow a year ago

    exactly what i need! thank you

  • @SebastianMantey
    @SebastianMantey a year ago

    Clarification on Information Gain vs. Overall Entropy (17:45)
    The formula for Information Gain is basically:
    Information Gain = Entropy before split - weighted Entropy after split
    Or, in other words:
    Information Gain = Entropy before split - Overall Entropy
    So, to determine the Entropy before the split, we need to calculate the following:
    Entropy before split = 42/130 * (-log2(42/130)) + 42/130 * (-log2(42/130)) + 46/130 * (-log2(46/130)) = 1.584
    So, the Information Gain for split 1 is:
    Information Gain = 1.584 - 0.646 = 0.938
    And the Information Gain for split 2 is:
    Information Gain = 1.584 - 0.804 = 0.780
    So, split 1 results in a higher Information Gain and we would choose it over split 2. Therefore, we get the same result as in the video, where we just used Overall Entropy. The reason I decided to just use Overall Entropy and not Information Gain is that they are essentially the same. With Overall Entropy you focus on the fact that the entropy decreases from 1.584 to 0.646 after the split. And with Information Gain you focus on the fact that the entropy of 1.584 decreases to 0.646, a drop of 0.938, which is the Information Gain. In my opinion, using Overall Entropy is simply more intuitive. Additionally, it requires one less calculation step.
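
    As a quick check of these numbers, a minimal NumPy sketch (the weighted post-split entropies, 0.646 and 0.804, are the values from the video):

    import numpy as np

    def entropy(class_counts):
        # entropy = sum over classes of p * (-log2(p))
        p = np.array(class_counts) / np.sum(class_counts)
        return float(np.sum(p * -np.log2(p)))

    entropy_before_split = entropy([42, 42, 46])  # ≈ 1.584
    print(entropy_before_split - 0.646)  # Information Gain of split 1 ≈ 0.938
    print(entropy_before_split - 0.804)  # Information Gain of split 2 ≈ 0.780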

  • @relaxthegamer
    @relaxthegamer a year ago

    :/ hey bro, it seems like information gain and overall entropy cause different split conditions. If I use information gain, the tree matches sklearn's Decision Tree algorithm (entropy).

    • @SebastianMantey
      @SebastianMantey a year ago

      I don’t think that the reason you are getting different split conditions is because of the difference between information gain and overall entropy since they are essentially the same. For clarification, please have a look at my pinned comment of this video: th-cam.com/video/ObLQcpuLAlI/w-d-xo.htmlsi=fWJvJ7xElXx6KJ4a&t=1048

  • @eladiomendez8226
    @eladiomendez8226 a year ago

    Great explanation!

  • @marcelohuerta2727
    @marcelohuerta2727 a year ago

    Is this the ID3 algorithm?

    • @SebastianMantey
      @SebastianMantey a year ago

      Please have a look at the comment from mustajabhussain9167 for the answer.

  • @ryantwemlow1798
    @ryantwemlow1798 a year ago

    This is so helpful, thank you for making this!

  • @tobe7602
    @tobe7602 a year ago

    It seems so simple when you do it, but I need hours to understand your work, so your work is truly impressive.

    • @SebastianMantey
      @SebastianMantey a year ago

      Thanks! But I also spent hours before recording the video. So, you are not alone. 😉

  • @MDRizwanulAmin
    @MDRizwanulAmin a year ago

    Hi, can anyone tell me where that variable “overall_entropy = 999” at 20:35 came from? Please!

    • @SebastianMantey
      @SebastianMantey a year ago

      As I said in the video, it is just an arbitrary high number. And we need it because of the if-statement that follows later within the for-loops (“if current_overall_entropy <= overall_entropy:”). We need an initial value for the variable “overall_entropy”. Otherwise, we would get an error in the first iteration of the for-loops because “overall_entropy” gets assigned/updated within the if-statement. So, in the first iteration it would not exist yet. That being said, maybe a better way of writing the code would have been to use “np.inf” from NumPy. So, that there is not this magic number. Hope that helps!
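
      For illustration, a minimal sketch of the np.inf pattern mentioned above (the candidate splits and entropy values are made up, just to show the idea):

      import numpy as np

      # start the running minimum at infinity instead of a magic number like 999
      overall_entropy = np.inf
      best_split = None
      # hypothetical (column, split value) candidates with their overall entropies
      candidate_splits = [(("age", 30.5), 0.92), (("fare", 15.0), 0.71), (("age", 22.0), 0.85)]
      for split, current_overall_entropy in candidate_splits:
          # on the first iteration this is always True, since any value <= np.inf
          if current_overall_entropy <= overall_entropy:
              overall_entropy = current_overall_entropy
              best_split = split
      print(best_split, overall_entropy)  # ('fare', 15.0) 0.71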

  • @HansPeter-lx6qk
    @HansPeter-lx6qk a year ago

    These videos are not famous enough. So good.

  • @The_Computer_Guy
    @The_Computer_Guy a year ago

    switch to dark mode powerpoints.

  • @hauntinglycelestial
    @hauntinglycelestial a year ago

    thanks a lot for those videos!! these seriously just saved my grades in data science!!

  • @hopelesssuprem1867
    @hopelesssuprem1867 a year ago

    Could you please explain what type of pruning this is, i.e. is it cost-complexity pruning like in CART or something else, and why did you decide to use this method?

    • @SebastianMantey
      @SebastianMantey a year ago

      I am assuming you are referring to post-pruning: As I mention at 14:44, the process is called “Reduced Error Pruning”. And I used it simply because that’s the process that was described in the book I was using, namely “Fundamentals of Machine Learning for predictive data analytics”.

    • @hopelesssuprem1867
      @hopelesssuprem1867 a year ago

      @@SebastianMantey oo, thanks. Now I've understood everything.

  • @hopelesssuprem1867
    @hopelesssuprem1867 a year ago

    overall_metric might be initialized as np.inf instead of checking True/False on the first iteration. In my opinion that looks clearer on an intuitive level.

    • @SebastianMantey
      @SebastianMantey a year ago

      Just to clarify, do you want to set “best_overall_metric = np.inf” at the beginning of the function and then check in the if-statement if “current_overall_metric <= best_overall_metric”? I think that should work.

    • @hopelesssuprem1867
      @hopelesssuprem1867 a year ago

      @@SebastianMantey Yes, you understood correctly. I've also checked it and it works. Now I'm trying to cope with the regression tree. I used the California housing dataset and got an accuracy of 3%, but the sklearn model shows nearly 45%. I'll be checking where the mistake is 😂.

  • @hopelesssuprem1867
    @hopelesssuprem1867 a year ago

    May I ask a question: why did you split the data at the average between two unique values? Is it a rule, or could I split the data by just taking every unique value? I thought that in this case we shouldn't have to care about the type of value (categorical and so on).

    • @SebastianMantey
      @SebastianMantey a year ago

      No, it’s not a rule. I just did it because I based the code on my “Decision Tree Algorithm explained” video (th-cam.com/video/ObLQcpuLAlI/w-d-xo.html). By the way, in part 9 (th-cam.com/video/V8xZ5fIiTVw/w-d-xo.html) I update this function so that the potential splits are actually the unique values themselves. Hope that helps!
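
      For illustration, a minimal sketch of the midpoint idea (not the exact function from the repo):

      import numpy as np

      def get_potential_splits(values):
          # propose the middle between each pair of neighboring unique values
          unique_values = np.sort(np.unique(values))
          return (unique_values[:-1] + unique_values[1:]) / 2

      print(get_potential_splits([2, 3, 4, 4, 7]))  # [2.5 3.5 5.5]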

    • @hopelesssuprem1867
      @hopelesssuprem1867 a year ago

      @@SebastianMantey Thanks for the answer. Your explanation is the best I've seen, and it helped me create my own version of a decision tree.

  • @hopelesssuprem1867
    @hopelesssuprem1867 a year ago

    Thank you for a great tutorial. Is it a CART version?

    • @SebastianMantey
      @SebastianMantey a year ago

      I didn’t specifically look at the ID3 or CART algorithms, for example, and then tried to implement them in code. What I did was (as far as I can remember): I had a high-level understanding of how decision trees basically work. Then, I tried to explain the algorithm in simple terms and in my own words in these two videos: th-cam.com/video/WlGuizdVaiY/w-d-xo.html th-cam.com/video/ObLQcpuLAlI/w-d-xo.html And then, finally, from the diagram that you can see at 0:19 in part 1 of this series (th-cam.com/video/y6DmpG_PtN0/w-d-xo.html), I created the code from scratch by myself just using this particular diagram and no other references. Hope that answers your question.

    • @hopelesssuprem1867
      @hopelesssuprem1867 a year ago

      @@SebastianMantey thanks a lot

  • @HanJiang-hs5di
    @HanJiang-hs5di a year ago

    Hi Sebastian, thanks for creating such useful videos. I'm also wondering what I should do if I would like to know the number of observations and entropy value of each leaf node. Thank you!

    • @SebastianMantey
      @SebastianMantey a year ago

      I’m glad that you found them helpful! That’s difficult to say because it would require a lot of little changes within the code. For example, you would probably need to change how the decision tree is structured, e.g. use a dictionary where the keys are “question”, “yes_answer”, “no_answer”, “entropy” and “n_observations”. And then, you would need to make the according adjustments in the code to provide the respective values for those keys. It's not a very specific answer, but maybe it helps a little bit.
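
      For illustration, a hypothetical node structure along those lines (names assumed, not code from the series):

      # a decision node that also stores its entropy and number of observations
      node = {
          "question": "petal_width <= 0.8",
          "yes_answer": ...,      # sub-tree (another dict) or leaf label
          "no_answer": ...,       # sub-tree (another dict) or leaf label
          "entropy": 0.65,        # entropy of the data that reached this node
          "n_observations": 57,   # number of training rows that reached this node
      }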

  • @rajvaghela4378
    @rajvaghela4378 a year ago

    Hats off to you sir! Thank you for such a deep and detailed video for post pruning a decision tree!!

  • @doggydoggy578
    @doggydoggy578 a year ago

    Is this classification or regression?

  • @vglez8088
    @vglez8088 a year ago

    Hi Sebastian, great job! Listen, I tried your calculate_r_squared() function but there seems to be an error. Basically, Python does not like your statement grid_search = pd.DataFrame(). It's only when I set the grid_search name to df that the error vanishes; however, then I have to set up a loop for printing. In this case I write: if max_depth == 6, then print(df), and I get a list of 17 items [see below]. This is not very elegant but works after all. Can you please explain why I get this error? Many thanks, ViG

    df = pd.DataFrame(grid_search)
    df.sort_values("r_squared_val", ascending=False).head()
    dfs = df.groupby('max_depth')
    dfs_updated = []
    for _, df in dfs:
        df['id'] = df.groupby(['max_depth', 'min_samples', 'r_squared_train', 'r_squared_val']).ngroup() + 1
        dfs_updated.append(df)
    df_new = pd.concat(dfs_updated)
    df_new
    if max_depth == 6:
        print(f"")
        print(df_new)

    • @SebastianMantey
      @SebastianMantey a year ago

      It is hard to say without actually seeing the code. But I find it strange that the code would work just because you change the name of the variable. That doesn’t make sense to me since Python doesn’t care what the name of a variable is. Sorry!

  • @saisaratchandra7693
    @saisaratchandra7693 2 years ago

    hidden gem!

  • @maheshmadhavan9486
    @maheshmadhavan9486 2 years ago

    The decision tree is working fine with a small dataset, but it's not finishing a binary classification on data with 3 lakh (300,000) rows. How can we reduce the iterations in determine_best_split()?

    • @SebastianMantey
      @SebastianMantey 2 years ago

      One thing that you could do, is to reduce the number of potential splits that are determined by the “get_potential_splits” function. For example, if the number of “unique_values” for a particular feature is greater than, let’s say, a thousand, then you could write the code in such a way that you just randomly pick a certain percentage (e.g. 20%) of all those splits. This way, you (probably) won’t include the best split, but at least you will include a split that is very close to the best split. And this percentage could then become another hyper-parameter for the decision tree algorithm, where you then have to trade-off speed versus accuracy. Hope that helps!
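
      For illustration, a minimal sketch of that idea (parameter names are assumptions, not code from the repo):

      import numpy as np

      def get_potential_splits(values, keep_fraction=0.2, threshold=1000, seed=0):
          # if a feature has more than `threshold` unique values, randomly keep
          # only a fraction of them as candidate splits (speed vs. accuracy trade-off)
          unique_values = np.unique(values)
          if len(unique_values) > threshold:
              rng = np.random.default_rng(seed)
              n_keep = int(len(unique_values) * keep_fraction)
              unique_values = rng.choice(unique_values, size=n_keep, replace=False)
          return np.sort(unique_values)

      splits = get_potential_splits(np.random.default_rng(1).normal(size=18_000))
      print(len(splits))  # about 20% of the ~18,000 unique values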

  • @Girly.tomboy123
    @Girly.tomboy123 2 years ago

    How do you open a Jupyter notebook from a conda environment?

    • @SebastianMantey
      @SebastianMantey 2 years ago

      stackoverflow.com/questions/58068818/how-to-use-jupyter-notebooks-in-a-conda-environment

  • @mathanimation7563
    @mathanimation7563 2 years ago

    Why did you stop making videos? Your videos are outstanding.

  • @daljeetsinghranawat6359
    @daljeetsinghranawat6359 2 years ago

    I have watched the whole series on Basics of Deep Learning Part 1-15: Coding a Neural Network from Scratch in Python... got hooked on it and completed it in one binge.

    • @SebastianMantey
      @SebastianMantey 2 years ago

      Thanks for the kind feedback! I appreciate that. I hope it was helpful.

  • @ahmadmponda3294
    @ahmadmponda3294 2 years ago

    This is indeed a unique style among all the YouTube videos I've ever watched. Excellent OO methods.

  • @reemnasser9105
    @reemnasser9105 2 years ago

    I'm lucky to have found your videos. Thanks for this effort :)

  • @redforestx7371
    @redforestx7371 2 years ago

    SUPERB EXPLANATION! THANK YOU!

  • @Guiltyass
    @Guiltyass 2 years ago

    What a great tutorial! But how can I visualize the random forest in a plot?

    • @SebastianMantey
      @SebastianMantey 2 years ago

      To be frank: I don’t know, sorry! Someone also asked it before under my Decision Tree from Scratch series and then I looked a little bit into it. But it seemed too complicated to be worth the effort.

  • @Bryan-eg7si
    @Bryan-eg7si 2 years ago

    Awesome video. I have a school assignment and your videos are a literal lifesaver. I'm trying to use your code for a multiple-split classification problem that I'm working on now.

    • @SebastianMantey
      @SebastianMantey 2 years ago

      Thanks! I'm glad my videos were helpful.

  • @dunjaorevic2382
    @dunjaorevic2382 2 years ago

    Really nice one! :D

  • @AayushRampal
    @AayushRampal 2 years ago

    Great video! I always wondered how decision trees choose their splits. Like, if the unique values are 2, 3, 4, does it split at < and > 2.5, 3.5, or does it use < and > 3, 4? Are you doing it the same way as sklearn?

    • @SebastianMantey
      @SebastianMantey 2 years ago

      Thanks! You can do it either way. Here, I am using the middle between two unique values as a splitting point. Later on in the series (th-cam.com/video/V8xZ5fIiTVw/w-d-xo.html), however, I actually use the unique values as splitting points. With regards to sklearn, I actually don’t know how they exactly do it.

  • @thedanger49
    @thedanger49 2 years ago

    Hey Sebastian, you're an amazing teacher, thank you for sharing this information with us. Can you please help me? I have a question: how can we handle categorical columns that have many unique values? I have a column named "country" that has 174 unique values, and I want to use it to solve a classification problem with a decision tree. If I use a one-hot encoder, I get too many columns and my tree becomes so big that I can't even see it. How can I solve this problem?

    • @SebastianMantey
      @SebastianMantey 2 years ago

      Thanks! As I explained in part 8 (th-cam.com/video/5eoFajw8TWk/w-d-xo.html), the downside of this implementation is that the algorithm is pretty slow when there are features with many different unique values. However, as you can see in the video, there is a feature with 80 different values (“LIMIT_BAL”) and the algorithm is still fast. It only gets slow for the feature “BILL_AMT1” which has over 18,000 different values. Therefore, I would assume that, in your case, the algorithm is slow because you didn’t specify a “max_depth”, for example a max depth of 5 (since you also said that the tree is very big). So, I would suggest testing out different values for the “max_depth” parameter. This way, the algorithm is faster, and you will also get a better performing tree since a tree with many layers tends to overfit the training data anyway. Hope that helps!

  • @supriyakumarmishra1437
    @supriyakumarmishra1437 2 years ago

    Sir, could you please make a visual representation of the tree? It looks good when it is interactive.

    • @SebastianMantey
      @SebastianMantey 2 years ago

      Please have a look at my response to the comment from "Nicholas Neo" in part 8 (th-cam.com/video/5eoFajw8TWk/w-d-xo.html&lc=Ugy00Z86PRw77vtZgXZ4AaABAg) since he had a similar question.