Scalable Bayesian Inference with Hamiltonian Monte Carlo

Tokyo Stan

21,042 views


Published on Jan 5, 2025

Comments

@zool0941 year ago ⁺¹
suuuucchhhh a great talk. really clear, thank you.
@nikitagupta9369 7 years ago ⁺³
Loved the talk, helped understand the intuition of HMC. Thanks
@ProfessorBeautiful 8 years ago ⁺¹
Excellent talk; thank you. And yes, to respond to your question at the end, it was that clear.
@oflasch 7 years ago ⁺¹
Great talk, thank you!
@deepbayes6808 7 years ago ⁺¹
Amazing talk.
@stevebez2767 7 years ago
?watson
@Samanthaz 4 years ago
Are your slides available? perhaps with the lecture transcript for each slide?
@stevebez2767 7 years ago
Brands Hatch=Sim C egg,sam,pools? (Monile Radiation)
@stevebez2767 7 years ago
[ibm 'hmc'?]
@alute5532 year ago
Biased inferences: in the wide-data regime, selection bias
Add a prior to regularize the system; it gives the math some added help
Likelihood: what we learned; in total, the posterior quantifies our information
Any statistical question is answered by manipulating the posterior
It reduces to an expectation, i.e. to computing an integral
We do numerical approximation
Since calculating it exactly is hard in D dimensions
To find an expectation, identify where to focus our computation: where the most contribution to those expectations is.
Not just the density: also consider the volume (over which that density is integrated)
High D: lots of corners, hard
Volume increases exponentially fast
Two competing forces:
1. Volume: wants to focus on large q
2. Density: focuses on the mode
(E.g. for a normal) the two balance out in the middle
The region of concentration is the typical set: a surface around the mode
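A small sketch of the typical-set point above, assuming a standard normal target (my choice of example, not taken from the talk's slides). It draws points in D dimensions and measures how far they land from the mode; the function names are made up for illustration.

```python
import math
import random

def gaussian_radii(dim, n_samples=2000, seed=0):
    """Draw n_samples points from a standard normal in `dim` dimensions
    and return each point's distance from the mode (the origin)."""
    rng = random.Random(seed)
    radii = []
    for _ in range(n_samples):
        squared = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(dim))
        radii.append(math.sqrt(squared))
    return radii

def summarize(dim):
    """Return (mean distance from the mode, closest draw to the mode)."""
    radii = gaussian_radii(dim)
    return sum(radii) / len(radii), min(radii)

for dim in (1, 10, 100):
    mean_r, min_r = summarize(dim)
    print(f"D={dim:4d}  mean radius={mean_r:6.2f}  "
          f"closest draw={min_r:6.2f}  sqrt(D)={math.sqrt(dim):6.2f}")
```

In one dimension some draws sit almost on the mode. As D grows, the draws should gather in a shell at a distance of roughly sqrt(D), and none of them land near the mode even though the density is highest there.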
Markov chain: a way of finding and exploring sets like that
It's a random function, tau
After a jump, next time there will be a new distribution of points
We get a Markov chain
If we can engineer the Markov chain to preserve our target distribution,
the Markov chain takes us to the typical set (and starts exploring that surface)
In many dimensions every point is far from the typical set
In the end: a nice quantification of where the probability really is
To compute the expectation of any function, average it over the Markov chain history, i.e. Markov chain Monte Carlo (MCMC)
Run long enough, it ensures we always converge to the true expectation
(always the right answer)
Q: how well can we do it?
How quickly do we converge to the true expectation? Transitions can be expensive, like in wide data
We exhaust computational resources long before we complete the exploration
Partial exploration means bias (missing probability); lots of MCMC algorithms are like that
Metropolis
1. Proposal: add some noise
2. Decide: accept or reject the proposal (based on where we came from)
If closer to the mode, then accept it
If away from the mode, we are likely to reject it
In many dimensions volume is weird, so it doesn't scale: outside the typical set there is more volume
The only way is to shrink the size of the perturbation to a really small neighborhood
We won't go anywhere, just a tiny transition
We end up with very inefficient exploration, very poor MCMC
So avoid guessing and checking: the acceptance probability is very small
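A rough sketch of the Metropolis notes above, again assuming a standard normal target and a plain Gaussian random-walk proposal (both are my assumptions for illustration). It only tracks how often proposals are accepted as the dimension grows.

```python
import math
import random

def log_density(q):
    """Log density of a standard normal, up to an additive constant."""
    return -0.5 * sum(x * x for x in q)

def rw_metropolis_acceptance(dim, step, n_iter=2000, seed=1):
    """Run random-walk Metropolis and return the fraction of accepted proposals."""
    rng = random.Random(seed)
    # Start from a draw of the target, i.e. already inside the typical set.
    q = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    logp = log_density(q)
    accepted = 0
    for _ in range(n_iter):
        # 1. Proposal: add some noise to the current point.
        proposal = [x + step * rng.gauss(0.0, 1.0) for x in q]
        logp_proposal = log_density(proposal)
        # 2. Decide: accept with probability min(1, density ratio).
        #    1 - random() lies in (0, 1], so the log is always defined.
        if math.log(1.0 - rng.random()) < logp_proposal - logp:
            q, logp = proposal, logp_proposal
            accepted += 1
    return accepted / n_iter

for dim in (2, 10, 100):
    rate = rw_metropolis_acceptance(dim, step=1.0)
    print(f"D={dim:4d}  step=1.0  acceptance rate={rate:.3f}")
```

With the step size held fixed, the acceptance rate should collapse as D grows, because almost every direction leads out of the typical set. Shrinking `step` brings the rate back up, but then each transition barely moves.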
Use a transition that knows the shape of our surface (how to stay on the contour?)
Need for automation
How to extract info about the surface?
Hamiltonian MCMC uses differential geometry
Use a vector field: assign a direction to every point; if the direction is right, don't guess anymore! Hence all new points lead to others on the same typical set
How: look at the density of the target function
Take the gradient of that function
The gradient is also a vector field
If we follow it, it leads to the mode (not useful)
Potentially, correct the gradient
Differential geometry automatically corrects the gradient
Physics analogy: a planet in orbit and its gravitational field
What's missing is momentum: transverse motion keeps us from falling
Too much momentum and gravity won't catch us at all?!
Key: add momenta in the right way
1. For every parameter q, add a momentum p (expanding the space)
2. Lift the target distribution up onto this space
Find a probabilistic structure pi(p, q)
How: by a conditional distribution
(for the momenta)
End up with a joint distribution (over momenta and parameters)
I can always recover the target distribution
I can project it down, getting rid of the momenta
Use a symplectic integrator: can bound errors, the transformation required from exact to approximate
Calculate how accurate the solution is by integrating over all deviations
Solution: a relationship between the cost of the algorithm and the step size
End up getting a lower bound and an upper bound (on the error); x = average acceptance probability
y = cost
For almost all models the relationship is bounded between the 2 lines
Between 0.6 and 0.8 the solution is nearly flat, nearly optimal
Choose the step size so that the average acceptance probability (x) is in 0.6 to 0.8
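A minimal sketch of the momentum and step-size notes above, assuming a standard normal target, a unit Gaussian kinetic energy and a leapfrog integrator (a common symplectic choice; this is not the adaptive machinery Stan actually uses). All names are illustrative.

```python
import math
import random

def potential(q):
    """U(q) = -log target density for a standard normal, up to a constant."""
    return 0.5 * sum(x * x for x in q)

def grad_potential(q):
    """Gradient of U; for the standard normal it is q itself."""
    return list(q)

def kinetic(p):
    """Kinetic energy for unit-variance Gaussian momenta."""
    return 0.5 * sum(x * x for x in p)

def leapfrog(q, p, step, n_steps):
    """Leapfrog (symplectic) integration of Hamilton's equations."""
    q, p = list(q), list(p)
    p = [pi - 0.5 * step * gi for pi, gi in zip(p, grad_potential(q))]
    for i in range(n_steps):
        q = [qi + step * pi for qi, pi in zip(q, p)]
        # Full momentum step between position updates, half step at the end.
        scale = step if i < n_steps - 1 else 0.5 * step
        p = [pi - scale * gi for pi, gi in zip(p, grad_potential(q))]
    return q, p

def hmc_average_acceptance(dim, step, n_steps=10, n_iter=500, seed=2):
    """Run basic HMC and return the average Metropolis acceptance probability."""
    rng = random.Random(seed)
    q = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    total = 0.0
    for _ in range(n_iter):
        p = [rng.gauss(0.0, 1.0) for _ in range(dim)]  # sample fresh momenta
        h_start = potential(q) + kinetic(p)
        q_new, p_new = leapfrog(q, p, step, n_steps)
        h_end = potential(q_new) + kinetic(p_new)
        accept_prob = math.exp(min(0.0, h_start - h_end))
        total += accept_prob
        if rng.random() < accept_prob:
            q = q_new  # momenta are discarded: project back down to q
    return total / n_iter

for step in (0.1, 0.5, 1.5):
    avg = hmc_average_acceptance(dim=50, step=step)
    print(f"step={step:.1f}  average acceptance probability={avg:.3f}")
```

Small steps keep the energy error tiny, so nearly everything is accepted, but each trajectory costs more gradient evaluations per unit of distance. Large steps are cheap and get rejected more often. Tuning the step until the average lands around 0.6 to 0.8 is the trade-off the notes describe.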
Intuition for how to:
1. choose the kinetic energy
2. choose the integration time
3. choose the step size
Fully automated
Decouple the 2 steps of inference
1. Modeling step: we choose the prior and likelihood
2. Computation step: compute those expectations
Make the step size smaller
If no step size works:
change your model, reimplementing it in a different way or changing your priors
Autodiff ensures exact computation of the necessary gradients
1. Control statements (if/else)
2. Probability density functions (PDF, CDF)
3. Linear algebra: addition, multiplication, decomposition
4. ODEs (non-stiff, stiff)
Space equipped with a Lie group to give a flow
Typical set: measure-preserving flow
Adiabatic Monte Carlo
Multimodal distributions
@josephmargaryan months ago
what the fuck


Michael Betancourt: Scalable Bayesian Inference with Hamiltonian Monte Carlo