What is your k-means max cluster for the zip code balance before you manually group the territories?
Hi Mark - I'm not entirely sure what you're asking here, so please reply if I've missed the point.
The back-of-the-napkin math (though you can customize it in the function parameters) is to put a target number of clusters into the k-means function for a given State/Province, such that the average cluster would be around 10-60% of the ideal territory size. We've experimented with a variety of thresholds and found that in more customer-dense areas, more clusters are better, so we might target more like 10-15% (i.e. 8-10 clusters per territory). In less dense areas, it might be more like 2-3 clusters per territory.
Say your target territory size is 3,000 schrute-bucks (or your favorite fictional currency), and we've got 18,000 schrute-bucks in a region we want to model. We're aiming for 18,000 / 3,000 = 6 territories. We might want 6 clusters per territory, since it's a moderately customer-dense area. 6 clusters * 6 territories = 36 target output clusters.
I hope that makes sense? Please let me know if I can help clarify.
-Hunter
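The arithmetic in the reply above can be sketched in a few lines of Python. This is only an illustration of the napkin math, not the R script from the video: the function names `target_cluster_count` and `average_cluster_share` are hypothetical, and rounding the territory count to the nearest whole number (with a floor of 1) is an assumption.

```python
def target_cluster_count(region_total, territory_size, clusters_per_territory):
    """Return the number of clusters (k) to request from k-means for one region.

    region_total: total index value in the region (e.g. 18,000 schrute-bucks)
    territory_size: ideal index value per territory (e.g. 3,000)
    clusters_per_territory: more for dense areas (8-10), fewer for sparse (2-3)
    """
    # Assumption: round to the nearest whole territory, never fewer than one.
    territories = max(1, round(region_total / territory_size))
    return territories * clusters_per_territory


def average_cluster_share(clusters_per_territory):
    """Average cluster size as a fraction of the ideal territory size."""
    return 1 / clusters_per_territory


# Worked example from the reply: 18,000 / 3,000 = 6 territories, 6 clusters each.
k = target_cluster_count(18_000, 3_000, 6)
print(k)  # 36
```

With 6 clusters per territory, each cluster averages about 17% of the ideal territory size, which sits inside the 10-60% range mentioned in the reply.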
@hunterbarcello2658 OK, so the script is optimizing the k-means for k number of clusters per state based on the “size” of the state. In your example we have 2 clusters for states with no index and 10 clusters based on the calc for states like New York.
Hi team, great work. I want to understand the index calculations and your R script better. Is there any material I can read up on?
Hi Sahil - This is Hunter from the video. You can check out the materials used for this demo (and the slides) on my GitHub page here: github.com/hbarcello/TC19_TableauSalesOps
@hunterbarcello2658 Thank you Hunter, that is extremely generous!
@hunterbarcello2658 Would you be able to talk a little bit about the methodology behind the clustering? I've been playing around with different numbers of "territories" as an input into the script, but it's still not really optimized. You'll have huge low-volume pockets in the Midwest, for example, that you'd think it would merge into one cluster. Am I wrong to assume that this doesn't really optimize by your chosen index, but rather splits the zip codes into more manageable chunks for manual merging?
@Orholam5 Yeah, as you've noticed, it's not a soup-to-nuts optimization solution. What we saw was that a lot of management wanted a lot of customized territories with very specific parameters, but we wanted to be able to put the postal codes into more manageable units (allocating 1,000 instead of 43,000), and that was the end goal of that little script/tool. I've played around a lot with building a full optimizer for this, but in the end, it often takes quite a while to run and still doesn't give exactly the output I wish I had.
@hunterbarcello2658 Thanks for the reply. It’s a bummer this isn’t easier to do. Deciding the total number of territories can be tricky, and I really didn’t want to have to use an ancient beast like Terralign for optimization.