Thomas Wiecki - Solving Real-World Business Problems with Bayesian Modeling | PyData London 2022
- Published on 7 Jul 2024
- Thomas Wiecki Presents:
Solving Real-World Business Problems with Bayesian Modeling
Among Bayesian early adopters, digital marketing is chief. While many industries are embracing Bayesian modeling as a tool to solve some of the most advanced data science problems, marketing is facing unique challenges for which this approach provides elegant solutions. Among these challenges are a decrease in quality data, driven by an increased demand for online privacy and the imminent "death of the cookie" which prohibits online tracking. In addition, as more companies are building internal data science teams, there is an increased demand for in-house solutions.
www.pydata.org
PyData is an educational program of NumFOCUS, a 501(c)3 non-profit organization in the United States. PyData provides a forum for the international community of users and developers of data analysis tools to share ideas and learn from each other. The global PyData network promotes discussion of best practices, new approaches, and emerging technologies for data management, processing, analytics, and visualization. PyData communities approach data science using many languages, including (but not limited to) Python, Julia, and R.
PyData conferences aim to be accessible and community-driven, with novice to advanced level presentations. PyData tutorials and talks bring attendees the latest project features along with cutting-edge use cases.
0:00 Welcome!
0:05 Speaker introduction and PyMC 4 release announcement
1:15 PyMC Labs - The Bayesian consultancy
2:39 Why is marketing so eager to adopt Bayesian solutions?
3:49 Case Study: Estimating Marketing effectiveness
6:00 Estimating Customer Acquisition Cost (CAC) using linear regression
7:36 Drawbacks of linear regression in estimating CAC
10:02 Blackbox Machine learning and its drawbacks
11:27 Bayesian modelling
11:52 Advantages of Bayesian modelling
14:12 How does Bayesian modelling work?
16:53 Solution proposals (priors)
17:26 Model structure
19:57 Evaluate solutions
20:16 Plausible solutions (posterior)
22:36 Improving the model
23:38 Modelling multiple Marketing Channels
24:51 Modelling channel similarities with hierarchy
26:13 Allowing CAC to change over time
28:00 Hierarchical Time Varying process
30:05 Comparing Bayesian Media Mix Models
30:47 What-If Scenario Forecasting
31:53 Adding other data sources as a way to help improve or inform estimates
33:00 When does Bayesian modelling work best?
33:35 Intuitive Bayes course
34:38 Question 1: Effectiveness of including seasonality variables?
36:03 Question 2: What is your recommendation for the best way to choose priors?
38:16 Question 3: How to test if an assumption about the data is valid?
39:07 Question 4: Do you take the effect of different channels on each other into account?
41:33 Thank you!
- Science & Technology
number one thomas, number one
Great talk! Thanks for sharing
Thank you very much for the talk. This was super interesting. I'm also building a Media Mix model at my company, and I have a question. How sensitive is this modeling framework to the scale of the data? That is, should one apply a max scaling or a standardization scaler? Should one scale the spend and revenue time series independently? Are there any best practices for this that you can link to or elaborate on?
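The two options named in the question differ in a way that is easy to see on toy data. The sketch below is only an illustration of what each transform does, not a recommendation from the talk; the helper names `max_scale` and `standardize` and the weekly numbers are hypothetical.

```python
import numpy as np

def max_scale(x):
    """Divide by the largest absolute value; zeros stay zero."""
    return x / np.max(np.abs(x))

def standardize(x):
    """Subtract the mean and divide by the standard deviation."""
    return (x - x.mean()) / x.std()

# Hypothetical weekly series, for illustration only.
spend = np.array([0.0, 1_000.0, 2_500.0, 4_000.0, 5_000.0])
revenue = np.array([20_000.0, 26_000.0, 31_000.0, 34_000.0, 35_000.0])

# Each series is scaled independently of the other.
spend_scaled = max_scale(spend)
revenue_scaled = max_scale(revenue)

# Standardization centers the series, so zero spend becomes negative.
spend_std = standardize(spend)
```

Max scaling keeps a week with zero spend at zero and keeps every value non-negative, whereas standardization moves zero spend to a negative number. That matters if the scaled spend is later passed through a saturation curve that assumes non-negative input.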
Is the hierarchical time series model available anywhere?
Does anyone know the name of the function he uses to model saturation? (at 23:00)
23:20 - most infuriating part of the talk, wish I had a link to the notebook where he did that.
What is so infuriating about it?
@pavellogacev94 he drastically improves the model with a "3-line code change" but doesn't show which three lines he changed
@steeperdrip9188 It's like "I want to tell people that I am smart, but I don't exactly want to share it."
import aesara.tensor as at
import pymc as pm

# ad_spend and customers are the observed data arrays, assumed to be defined beforehand

def tanh_saturation(x, b, c):
    # levels off at b for large x; the slope near x = 0 is 1/c
    return b * at.tanh(x / (b * c))

with pm.Model() as model:
    # parameter = prior specification
    baseline = pm.Normal("baseline", mu=200, sigma=300)
    cac = pm.Normal("cac", mu=2.5, sigma=5)
    saturation = pm.Normal("saturation", mu=500, sigma=80)
    # saturating response to ad spend, plus baseline
    pred = tanh_saturation(ad_spend, saturation, 1/cac) + baseline
    noise = pm.HalfNormal("noise", 100)
    # likelihood
    obs = pm.Normal("customers",
                    mu=pred,
                    sigma=noise,
                    observed=customers)
    # inference button(TM)!
    idata = pm.sample()
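To see what the tanh_saturation function in the comment above does to ad spend, here is a minimal NumPy sketch of the same formula evaluated outside of any model. The function name `tanh_saturation_np` and the spend values are hypothetical; the values for `saturation` and `cac` simply reuse the prior means from the snippet.

```python
import numpy as np

def tanh_saturation_np(x, b, c):
    """NumPy counterpart of the formula above: b * tanh(x / (b * c))."""
    return b * np.tanh(x / (b * c))

# Hypothetical fixed values, taken from the prior means in the snippet.
saturation = 500.0   # b: ceiling that the curve approaches
cac = 2.5            # the model passes c = 1 / cac

# Hypothetical ad spend levels spanning several orders of magnitude.
ad_spend = np.array([0.0, 10.0, 100.0, 1_000.0, 10_000.0])
customers = tanh_saturation_np(ad_spend, saturation, 1 / cac)

# Ratio of response to spend at a small spend level; close to 1 / c.
initial_slope = customers[1] / ad_spend[1]
```

The curve starts at zero, rises almost linearly with slope 1/c while spend is small, and flattens out at `b`, so additional spend beyond that point adds almost nothing to the prediction.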