Ep. 4 - Marketing Mix Modeling: Analyzing Facebook Robyn One-Pager output models

  • Published 13 Sep 2024

Comments • 25

  • @gabrielefranco3452 · 2 years ago · +3

    Great content, thanks for sharing

  • @jinqimao4781 · 1 year ago · +2

    Thanks for all information!

  • @javierlarramona5923 · 2 years ago · +1

    Thanks a lot for the tutorials. Very well explained. And I got a pretty good accuracy which is nice ;)

  • @hybridamarketing6776 · 2 years ago · +3

    Awesome

  • @laresbernardo · 2 years ago · +4

    Loved these tutorials. Thanks a lot for sharing!
    2:29 You actually have more than 5 models in your Pareto front, but with clustering set to TRUE they were reduced to 5 :)

    • @cassandra4533 · 2 years ago · +2

      Really appreciate your comment Bernardo, thanks!
      About the clustering, that's indeed true. It's something I left out as it had only recently been released and I hadn't yet fully grasped the logic behind it at the time of recording. Thanks for pointing that out :).

  • @user-bz5qv1jt4j · 5 months ago · +1

    Thank you so much for these videos. I have a question about the first graph, the response decomposition waterfall.
    I ran this example with PackageVersion==3.10.3 and my second-biggest predictor of revenue in this chart is (intercept), whereas in this example it's the second to last. What is the (intercept)? I don't know how to explain or interpret this factor. Would you mind shedding some light?
    Thanks in advance!

    • @cassandra4533 · 5 months ago

      Hey, glad you liked the series :)
      The simplest explanation you can use is: the intercept represents all the sales you'd make anyway, regardless of your paid and organic marketing activities.
      The intercept's contribution varies a lot from business to business and is strongly influenced by the amount of historical data as well as the length of the modeling window selected. So if you feel your intercept is not correctly attributed, you might want to try adding more data or changing the selected window.
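As an illustration of that definition, here is a toy regression (all numbers made up, not from the video or from Robyn) where the fitted intercept recovers the revenue you would expect with zero media spend:

```python
import numpy as np

# Toy weekly data: revenue = 500 baseline + 12 per unit of spend.
# The figures are purely illustrative.
spend = np.array([0.0, 10.0, 20.0, 30.0, 40.0])
revenue = 500.0 + 12.0 * spend

# Ordinary least squares with an explicit intercept column of ones.
X = np.column_stack([np.ones_like(spend), spend])
(intercept, slope), *_ = np.linalg.lstsq(X, revenue, rcond=None)

print(intercept)  # ~500.0: revenue expected with zero media spend
```

Robyn's actual model is far richer (adstock, saturation, trend/seasonality via Prophet), but the intercept plays the same "baseline at zero activity" role.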

  • @sourishpal8286 · 9 months ago · +1

    Well explained content, thanks for sharing!
    I'd like to understand how the ROI numbers for each channel are computed here (15:35).

    • @cassandra4533 · 9 months ago

      Hello, ROIs are generated by dividing the revenue the model attributes to each channel by the total budget spent on that channel.
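A minimal sketch of that calculation (channel names and figures are hypothetical, not from the video):

```python
# Model-attributed revenue and total spend per channel (hypothetical figures).
attributed_revenue = {"tv_S": 120_000.0, "facebook_S": 80_000.0, "search_S": 60_000.0}
total_spend = {"tv_S": 60_000.0, "facebook_S": 20_000.0, "search_S": 30_000.0}

# ROI per channel = revenue attributed to the channel / spend on that channel.
roi = {ch: attributed_revenue[ch] / total_spend[ch] for ch in attributed_revenue}
print(roi)  # {'tv_S': 2.0, 'facebook_S': 4.0, 'search_S': 2.0}
```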

  • @guidocasco935 · 6 months ago · +1

    Hi mate, great content!
    Quick question: how would you evaluate the quality of the model from a business perspective?
    I understand that eventually you could test whether the optimization results made sense after investing according to Robyn's recommendations. But is there a way to evaluate the quality of the model ex ante?
    Thanks!

    • @cassandra4533 · 6 months ago · +1

      Hey Guido, with newer versions of Robyn (compared to this video) we also get Confidence Intervals as an output.
      Typically we look at a mix of things to ensure the model makes sense statistically, including the model's general KPIs, Confidence Intervals, adstocks, and the contribution of baseline vs. other factors.
      If that looks good from a statistical point of view, we compare it with the client's own business knowledge, which typically means comparing against their known CPO/ROI up to that point, just to make sure it's in a realistic range (while still assessing uncertainty for channels where the C.I.s are too wide). We're obviously not expecting the ROI/CPO to match their "known" (typically cookie-based) figures.
      From there it's a matter of both understanding why there are differences (and using data to validate hypotheses) and relying on experimentation as the definitive validation.

    • @guidocasco935 · 6 months ago · +1

      ​@@cassandra4533 Super clear mate! Thanks! 🙌

  • @abdelhakchahid2 · 1 year ago · +1

    Hi, Cassandra team.
    Thank you for the excellent video. How would you explain the TREND and INTERCEPT variables in the waterfall figure?

    • @cassandra4533 · 1 year ago · +1

      Hello! Glad you are liking it :)
      The intercept can be defined as the amount of revenue/conversions you would generate anyway, even if you stopped all of your marketing and organic activities today.
      Trend, on the other hand, captures the general long-term pattern in your data. If your brand has been consistently growing for the past 2 years, increasing the overall volume of sales, you'll see that reflected in the Trend variable.

  • @Mvobrito · 2 years ago · +2

    In your experience with real data, what happens when some sources are much bigger than the others?
    For example, if Facebook and Google spend 20x more than the 3rd-biggest source, will the model tend to over- or underestimate the impact of the bigger or smaller sources?

    • @cassandra4533 · 2 years ago

      Great question! So far we've yet to face such an issue.
      However, in a really extreme case such as the one above, if the impact of the 3rd source is really small, you may want to remove it from the model altogether.
      On the other hand, you may want to look into Robyn's calibration function, which helps tune the model to be more precise about each channel's contribution.
      More info here: facebookexperimental.github.io/Robyn/docs/calibration

  • @Mvobrito · 2 years ago · +2

    Also, can we get the daily (or weekly, in the example) predictions of the model by source?
    For example, a plot with date on the X axis and revenue or conversions on the Y axis, with one line per source.

    • @cassandra4533 · 2 years ago

      As of today Robyn doesn't ship a function to do that automatically.
      However, using the data in the pareto_alldecomp_matrix.csv file, you can plot that information yourself.
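A rough sketch of that reshaping step, using a hypothetical excerpt of the file (the column names and values below are assumptions for illustration; check your own export for the real ones):

```python
import csv, io

# Hypothetical mini version of pareto_alldecomp_matrix.csv: a date column
# ("ds") plus one decomposed-response column per source.
raw = """ds,tv_S,facebook_S
2024-01-01,100,80
2024-01-08,120,70
2024-01-15,90,95
"""

rows = list(csv.DictReader(io.StringIO(raw)))
sources = [c for c in rows[0] if c != "ds"]

# One (date, value) series per source -> one line per source when plotted,
# e.g. with matplotlib: plt.plot(dates, values, label=source).
series = {s: [(r["ds"], float(r[s])) for r in rows] for s in sources}
print(sources)  # ['tv_S', 'facebook_S']
```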

  • @LuisGustavoFarro · 11 months ago · +1

    How do you normally obtain the competitors' sales column? This is not something we are used to having, for example in retail.

    • @cassandra4533 · 11 months ago

      Hey Luis, that is Project Robyn's demo dataset.
      I'm not sure how to get that in a real-world scenario, but you could use tools like SemRush or SimilarWeb to get an estimate of their media spend, for instance, which would work as well.

  • @meheh002 · 6 months ago · +1

    At 21:53 you say that if our business knowledge differs, we should abandon that particular model. What's the point of running it if you discard it based on past experience? Why not focus on the R², NRMSE, RSSD and MAPE values instead, to select objectively?

    • @cassandra4533 · 6 months ago

      Hey, great question, I'll try to simplify this to its core.
      When you run Robyn you get a set of optimal models, because statistically it's impossible to pick a single model and declare it the overall best compared to all the others.
      That, plus the fact that all those models vary slightly from one another (one might favor Variable 1 more, another might favor Variable 3), still requires human input as of today.
      That is for two reasons:
      1) Business-wise, you want a model your stakeholders will trust. If you present something far from their current knowledge right from the start, they'll just say "this doesn't make sense" and won't even consider it. If instead you first build trust in the model with something close to their knowledge, you have room to make them switch over time.
      2) Statistically, there are many different ways to achieve an accurate model. That's why we also use the RSSD, to exclude the less "realistic" options up front, but you might still get some models that are a bit off on the attribution.
      Hope this answers your question :)
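For reference, the two fit/decomposition metrics mentioned in the question can be sketched as follows (this is my reading of Robyn's documented definitions, written out for illustration, not Robyn's actual source code):

```python
import math

def nrmse(actual, predicted):
    """Root-mean-square error normalized by the range of the actuals."""
    rmse = math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))
    return rmse / (max(actual) - min(actual))

def decomp_rssd(effect_share, spend_share):
    """Distance between each channel's share of modeled effect and share of spend."""
    return math.sqrt(sum((e - s) ** 2 for e, s in zip(effect_share, spend_share)))

# Made-up example: three weeks of actual vs. predicted revenue,
# and two channels' effect vs. spend shares.
y, y_hat = [100.0, 200.0, 300.0], [110.0, 190.0, 310.0]
print(nrmse(y, y_hat))                      # 0.05
print(decomp_rssd([0.6, 0.4], [0.5, 0.5]))  # ~0.1414
```

Robyn's multi-objective optimization minimizes both at once, which is why no single "best" model falls out: a model with the lowest NRMSE may have a worse RSSD, and vice versa.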

    • @meheh002 · 6 months ago · +1

      ​@@cassandra4533 Hey thx for your response. The deviation between the models seems so big I'm wondering if the dangers of using this aren't greater than benefits.
      What would have to happen for us to be able to objectively pick the best one?

    • @cassandra4533 · 6 months ago

      @@meheh002 Tough to say with no context at all, unfortunately. I'd suggest starting from the data: ensure it is complete and comprehensive and that you actually included all the relevant variables. Once that's out of the way, I'd focus on the model. For cases with lots of uncertainty, dig deeper into why that's really happening and what the cause is. Often it's driven by something like multicollinearity; once you identify the cause, you can define the solution. It could be an incrementality experiment, for instance using GeoLift.