Great video. However, I think that the precision/recall explanation at the end should be the other way around.
Good eye. I'll clear this up in a comment. Thanks for catching this
Came here to say this!
Aye
It would be great if you add a comment on the video itself or correct that section. Many may not read the comment section and thus not realize it.
@bahmanaboulhasanzadeh8512 Exactly!! I almost did not read the comment section.
This is such a great video, thank you!
Precision: Of the users who churned, how many did the model predict would churn?
Precision: Of the people who left, for how many had the model said they might leave?
Recall: Of the users the model said would churn, how many users actually churned?
Recall: Of those the model said might leave, how many really did leave?
Thank you for yet another awesome video! I like how you simplify data science concepts.
what is work order ???
Great video
I think you got the precision and recall swapped: precision = TP / (TP + FP), recall = TP / (TP + FN). The true positives + false negatives in recall stand for the users who actually churned, and the true positives + false positives in precision are the users the model predicted would churn!
Yep. I saw this in another comment too. Good eye! I'll write a clarifying comment on this soon
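For anyone who wants to check the two formulas from the comment above, here is a minimal sketch in plain Python. The function name `precision_recall` and the eight labels are made up for illustration; they are not from the video.

```python
def precision_recall(actual, predicted):
    """actual / predicted: lists of booleans, True = churned / predicted to churn."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a and p)        # predicted churn, did churn
    fp = sum(1 for a, p in pairs if not a and p)    # predicted churn, stayed
    fn = sum(1 for a, p in pairs if a and not p)    # predicted stay, churned
    # Precision: of the users the model flagged, how many actually churned.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: of the users who actually churned, how many the model flagged.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical labels for 8 users.
actual = [True, True, True, False, False, False, True, False]
predicted = [True, True, False, True, False, False, False, False]

precision, recall = precision_recall(actual, predicted)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Here the model flags 3 users and 2 of them churned, while 4 users churned in total, so precision is 2/3 and recall is 2/4.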
@CodeEmporium I loved this presentation - you're so good at touching the main points (e.g. data leakage) and all in 13 mins!! Exactly what I was looking for, well done!
Can you provide examples of practices in tools related to churn, for example in Orange Data Mining?
I'm still learning to churn, thank you
Love it!
I want to ask about features that are perhaps too biased. Wouldn't something like "days from last order" or "total order count" be too biased for predicting churn? I mean, it's common sense that if a customer hasn't ordered for a while, or has ordered only once or twice over a long time, he is SUPER likely to churn when the churn period arrives. Wouldn't such features hide or overshadow other features that could show us what is actually driving the churn predictions?
True. Like if the feature is 89 days since last purchase, chances are they will churn. And now that I think about it, making that prediction probably won't be useful, since acting on it is impossible. We could just exclude that feature and ask "What is the probability the customer churns 3 months from making this prediction?" Good eye
@CodeEmporium Then this would be related to Survival Analysis?
Awesome video! The step at 9:05 went a bit too fast in my opinion. Could you show us, perhaps with made up data, how you verify it using SQL? What goes into the process of defining a "Correct" and an "incorrect"?
Performing uni-variate analysis of each predictor variable's relationship with the target variable should answer the "hunches"/hypotheses. For instance, plotting a bar chart of Gender vs Churn will tell us whether males or females are more likely to churn. In this case, we can visualize #WorkOrdersIn6Months against Churn
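A rough sketch of that kind of check, with made-up data: compute the churn rate per bucket of a single feature, which is the number a bar chart would show. The bucket labels, the column name `work_orders_6m`, and the rows are all hypothetical.

```python
from collections import defaultdict


def churn_rate_by_group(rows, feature):
    """Return {feature value: share of rows with that value that churned}."""
    totals = defaultdict(int)
    churned = defaultdict(int)
    for row in rows:
        totals[row[feature]] += 1
        churned[row[feature]] += int(row["churned"])
    return {group: churned[group] / totals[group] for group in totals}


# Hypothetical customers, bucketed by work orders in the last 6 months.
rows = [
    {"work_orders_6m": "0-1", "churned": True},
    {"work_orders_6m": "0-1", "churned": True},
    {"work_orders_6m": "0-1", "churned": False},
    {"work_orders_6m": "2+", "churned": False},
    {"work_orders_6m": "2+", "churned": False},
    {"work_orders_6m": "2+", "churned": True},
    {"work_orders_6m": "2+", "churned": False},
]

rates = churn_rate_by_group(rows, "work_orders_6m")
for group, rate in sorted(rates.items()):
    print(f"{group}: {rate:.0%} churned")
```

If the rates differ a lot between buckets, the hunch about that feature has some support; with real data you would also want enough rows per bucket before reading much into the gap.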
Great video. Love from pakistan
Awesome! Welcome aboard!
amazing
Wait, why can we only train on data up to 3 months in the past? If you have the same features for customers from last year and can also see their usage history from last year, couldn't you use that historical data to predict churn as well?
I have the same question.
Thank you,
Could you please help me with how to write a mathematical equation for a churn model?
Start porting your videos to LBRY's Odysee as well. You can set it to automatically upload there whenever a video uploads to YouTube, and this can be activated when you create your Odysee account. You deserve way more exposure, maybe that can help a little since there is far less competition there.
Thanks for the suggestion Daniel. I def might consider this
How should we treat the test dataset? Does that also have to be randomly sampled? How would you present recall and precision: at the user level or at the daily observation level? Thanks
awesome
Thank youu
There are two points that you mentioned in the model training slides
1. Create snapshots randomly for active users at a given time
2. Only train on data up to 3 months in the past
Does that mean that when I want to create a random snapshot, let's say for 18th Mar 2021, I only include customers who have purchased at least once in the last 3 months? Anyone whose last purchase was 4 months prior to 18th Mar has already churned and is hence inactive, which would violate the first point.
Also, for those users who are considered active, can we look back at data from more than 3 months ago or not? I am guessing not, because that would violate rule 2, but then you are using the feature "# work orders in last 6 months". So I am a little confused here.
I have the same questions.
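One plausible reading of the two rules, sketched in Python; this is an assumption, not something the video confirms. Under this reading, "up to 3 months in the past" limits the snapshot date, because the churn label needs a full 3 months after the snapshot to be observed, while features may look further back, such as 6 months. The 90 and 180 day windows, the function name `build_snapshot`, and the purchase data are all made up.

```python
from datetime import date, timedelta

CHURN_WINDOW = timedelta(days=90)     # "3 months", approximated as 90 days
FEATURE_WINDOW = timedelta(days=180)  # "6 months" lookback for features


def build_snapshot(purchases, snapshot_date, today):
    """purchases: {user_id: [purchase dates]}.

    Returns {user_id: (orders_last_6m, churned)} for users active at the
    snapshot, or None if the label window has not fully elapsed yet.
    """
    label_end = snapshot_date + CHURN_WINDOW
    if label_end > today:
        return None  # too recent: we cannot know yet who churned
    examples = {}
    for user, dates in purchases.items():
        past = [d for d in dates if d <= snapshot_date]
        # Active = at least one purchase in the 90 days before the snapshot.
        if not any(snapshot_date - d < CHURN_WINDOW for d in past):
            continue
        # Features only use data at or before the snapshot (no leakage).
        orders_6m = sum(1 for d in past if snapshot_date - d < FEATURE_WINDOW)
        # Label only uses data after the snapshot.
        churned = not any(snapshot_date < d <= label_end for d in dates)
        examples[user] = (orders_6m, churned)
    return examples


purchases = {
    "a": [date(2020, 11, 1), date(2021, 2, 10), date(2021, 4, 5)],
    "b": [date(2021, 1, 20)],
    "c": [date(2020, 10, 1)],  # already inactive on the snapshot date
}
today = date(2021, 9, 1)
snap = build_snapshot(purchases, date(2021, 3, 18), today)
too_recent = build_snapshot(purchases, date(2021, 8, 1), today)
print(snap, too_recent)
```

With this reading, user "c" is dropped from the 18 Mar 2021 snapshot because the last purchase was more than 90 days earlier, and the 1 Aug 2021 snapshot is rejected because its 3-month outcome is not known yet.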
Great video! I don’t think you could use regression though because of right censoring. You don’t know if some of those customers will churn and when. Survival analysis models deal with this kind of censoring.
Ah. Good call. I think you're right with this
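To make the censoring point concrete, here is a minimal Kaplan-Meier estimator in plain Python. It is a sketch of one standard survival analysis technique, not something shown in the video, and the durations are made up. Customers whose churn has not been observed yet (right-censored) still count as "at risk" until the day they drop out of observation.

```python
def kaplan_meier(durations, observed):
    """durations: days until churn, or until observation ended.
    observed: True if churn was seen, False if right-censored.

    Returns [(time, estimated probability of surviving past time)],
    with one entry per time at which at least one churn was observed.
    """
    events = sorted(zip(durations, observed))
    at_risk = len(events)
    survival = 1.0
    curve = []
    i = 0
    while i < len(events):
        t = events[i][0]
        churned = 0
        leaving = 0
        # Group everyone whose duration equals t.
        while i < len(events) and events[i][0] == t:
            churned += int(events[i][1])
            leaving += 1
            i += 1
        if churned:
            survival *= 1 - churned / at_risk
            curve.append((t, survival))
        # Churned and censored customers both leave the risk set after t.
        at_risk -= leaving
    return curve


# Hypothetical customers: two are still active when observation ends.
durations = [30, 60, 60, 90, 120]
observed = [True, False, True, True, False]

curve = kaplan_meier(durations, observed)
for t, s in curve:
    print(f"day {t}: {s:.2f}")
```

A plain regression on "days until churn" would have to either drop the two censored customers or treat their censoring day as a churn day, and both choices bias the estimate.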
Precision recall definition seems to be goofed up.
Yea. My b. Pinned a previous comment that said the same. Thanks!
Is this a Hancock 'good job'??
I don't agree with the 2nd feature.
still great vdo 👍