Variational Auto Encoder (VAE) - Theory

  • Published Jan 25, 2023
  • VAEs are a mix of Variational Inference (VI) and autoencoder neural networks, used mainly for generating new data. In this video we outline the theory behind the original paper, including a look at regular autoencoders, variational inference, and how the two combine to create the VAE.
    Original Paper (Kingma & Welling 2014): arxiv.org/pdf/1312.6114
    The first and only Variational Inference (VI) course online!
    Become a member and get full access to this online course:
    meerkatstatistics.com/courses...
    ** 🎉 Special YouTube 60% Discount on Yearly Plan - valid for the 1st 100 subscribers; Voucher code: First100 🎉 **
    “VI in R” Course Outline:
    Administration
    * Administration
    Intro
    * Intuition - what is VI?
    * Notebook - Intuition
    * Origin, Outline, Context
    KL Divergence
    * KL Introduction
    * KL - Extra Intuition
    * Notebook - KL - Exercises
    * Notebook - KL - Additional Topics
    * KL vs. Other Metrics
    VI vs. ML
    * VI (using KL) vs. Maximum Likelihood
    ELBO & “Mean Field”
    * ELBO
    * “Mean Field” Approximation
    Coordinate Ascent VI (CAVI)
    * Coordinate Ascent VI (CAVI)
    * Functional Derivative & Euler-Lagrange Equation
    * CAVI - Toy Example
    * CAVI - Bayesian GMM Example
    * Notebook - Normal-Gamma Conjugate Prior
    * Notebook - Bayesian GMM - Unknown Precision
    * Notebook - Image Denoising (Ising Model)
    Exponential Family
    * CAVI for the Exponential Family
    * Conjugacy in the Exponential Family
    * Notebook - Latent Dirichlet Allocation Example
    VI vs. EM
    * VI vs. EM
    Stochastic VI / Advanced VI
    * SVI - Review
    * SVI for Exponential Family
    * Automatic Differentiation VI (ADVI)
    * Notebook - ADVI Example (using STAN)
    * Black Box VI (BBVI)
    * Notebook - BBVI Example
    Expectation Propagation
    * Forward vs. Reverse KL
    * Expectation Propagation
    Variational Auto Encoder
    Why become a member?
    * All video content
    * Extra material (notebooks)
    * Access to code and notes
    * Community Discussion
    * No Ads
    * Support the Creator ❤️
    VI (restricted) playlist: bit.ly/389QSm1
    If you’re looking for statistical consultation, someone to work on interesting projects, or training workshops, visit my website meerkatstatistics.com/ or contact me directly at david@meerkatstatistics.com
    ~~~~~ SUPPORT ~~~~~
    Paypal me: paypal.me/MeerkatStatistics
    ~~~~~~~~~~~~~~~~~
    Intro/Outro Music: Dreamer - by Johny Grimes
    • Johny Grimes - Dreamer

Comments • 16

  • @paedrufernando2351
    @paedrufernando2351 5 months ago +3

    @6:10 VI starts. The rundown was awesome.. puts everything into perspective

  • @user-qt5xd9lk5j
    @user-qt5xd9lk5j 5 months ago +1

    Fantastic video! This effectively resolved my queries.

  • @evgeniyazarov4230
    @evgeniyazarov4230 6 months ago

    Great explanation! The two ways of looking at the loss function are insightful
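For reference, the two equivalent forms of the ELBO — presumably the "two ways of looking" at the loss mentioned here — are standard:

```latex
\mathcal{L}(q)
  = \mathbb{E}_{q(z)}\!\left[\log p(x, z)\right] - \mathbb{E}_{q(z)}\!\left[\log q(z)\right]
  = \mathbb{E}_{q(z)}\!\left[\log p(x \mid z)\right] - \mathrm{KL}\!\left(q(z) \,\|\, p(z)\right)
```

The first form is what VI optimizes directly; the second reads as a reconstruction term minus a regularizer pulling q(z) toward the prior.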

  • @123sendodo4
    @123sendodo4 1 year ago

    Very clear and useful information!

  • @shounakdesai4283
    @shounakdesai4283 4 months ago

    awesome video.

  • @evaggelosantypas5139
    @evaggelosantypas5139 1 year ago +1

    Hey great video, thank you for your efforts. Is it possible to get your slides ?

    • @MeerkatStatistics
      @MeerkatStatistics 1 year ago +2

      Thanks. The slides are offered on my website meerkatstatistics.com/courses/variational-inference-in-r/lessons/variational-auto-encoder-theory/ for members. Please consider subscribing to also support this channel.

    • @evaggelosantypas5139
      @evaggelosantypas5139 1 year ago

      @@MeerkatStatistics ok thnx

  • @minuklee6735
    @minuklee6735 2 months ago

    Thank you for the awesome video! I have a question @11:35. I don't clearly understand why g_\theta takes x. Am I correct that it does not take x if g_\theta is a Gaussian distribution? As it would just be g_\theta(\epsilon) = \sigma*\epsilon + \mu (where \sigma and \mu come from \theta)?
    Again, I appreciate your video a lot!

    • @MeerkatStatistics
      @MeerkatStatistics 1 month ago

      Although not explicitly denoted, q(z) also depends on the data. This is why g_\theta will usually also depend on x. I didn't want to write q(z|x) as in the paper, because it is not a posterior, but rather a distribution whose parameters you tweak until it reaches the true posterior p(z|x).
      I have a simple example (for the CAVI algorithm) on my website (for members) meerkatstatistics.com/courses/variational-inference-in-r/lessons/cavi-toy-example/
      and also a slightly more elaborate example free on YouTube th-cam.com/video/8DzIPZnZ12k/w-d-xo.htmlsi=8Un505QqOEtij9XV - in both cases you'll see a q(z) that is Gaussian, but whose parameters depend on the data x.
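The reparameterization discussed above can be sketched in a few lines. This is a minimal illustration, not code from the video: the `encoder` function below is a hypothetical stand-in for a neural network with weights theta that maps data x to the parameters of a Gaussian q(z).

```python
import numpy as np

rng = np.random.default_rng(0)

def encoder(x):
    # Hypothetical stand-in for a learned NN: maps x to (mu, log_sigma) of q(z).
    mu = 0.5 * x
    log_sigma = -0.1 * np.abs(x)
    return mu, log_sigma

def g_theta(epsilon, x):
    # Reparameterization: z = mu(x) + sigma(x) * epsilon, with epsilon ~ N(0, I).
    # g depends on x because mu and sigma are themselves computed from the data,
    # even though we write q(z) rather than q(z|x).
    mu, log_sigma = encoder(x)
    return mu + np.exp(log_sigma) * epsilon

x = np.array([1.0, -2.0, 0.3])
epsilon = rng.standard_normal(x.shape)
z = g_theta(epsilon, x)  # a sample from q(z), differentiable w.r.t. mu and sigma
```

Because the randomness lives entirely in epsilon, gradients with respect to the encoder parameters flow through mu and sigma — which is the point of the trick.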

  • @marcospiotto9755
    @marcospiotto9755 14 days ago

    What is the difference between denoting p_theta (x|z) vs p(x|z,theta) ?

    • @MeerkatStatistics
      @MeerkatStatistics 14 days ago

      I think "subscript" theta is just the standard way of denoting that we are optimizing theta, that is, we are changing theta, while "conditioned on" theta is usually used when the thetas are given. Also note that the subscript theta refers to the NN parameters, while "conditioned on" often refers to distributional parameters. I don't think these are rules set in stone, though, and I'm not an expert in notation. As long as you understand what's going on - that's the important part.
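To make the distinction concrete, here is a tiny illustrative sketch (my own example, not from the video): theta below is a set of network weights that parameterize p_theta(x|z), and we optimize it rather than condition on it.

```python
import numpy as np

# Hypothetical decoder weights theta - the NN parameters the subscript refers to.
theta = {"W": 0.1 * np.ones((2, 3)), "b": np.zeros(2)}

def decoder_mean(z, theta):
    # p_theta(x|z): a Gaussian over x whose mean is a deterministic function of z
    # through theta. theta is tuned by gradient descent; it is not a random
    # variable we condition on, which is what the subscript notation emphasizes.
    return theta["W"] @ z + theta["b"]

z = np.array([1.0, 0.0, -1.0])
mean_x = decoder_mean(z, theta)  # mean of p_theta(x|z) for this latent z
```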

    • @marcospiotto9755
      @marcospiotto9755 14 days ago

      @@MeerkatStatistics got it, thanks!

  • @stazizov
    @stazizov 1 month ago

    Could you please tell me if there is a mistake in the notation? @8:26 z_{i} = z_{l}?

    • @MeerkatStatistics
      @MeerkatStatistics 1 month ago +1

      Hey, yes of course. Sorry for the typo.

    • @stazizov
      @stazizov 1 month ago

      ​@@MeerkatStatistics Thank you so much) Great video!!! 🔥