Machine learning - Introduction to Gaussian processes

  • Published on 12 Dec 2024

Comments • 162

  • @life99f
    @life99f 2 years ago +2

    I feel so fortunate to have found this video. It's like walking in a fog and finally being able to see things clearly.

  • @maratkopytjuk3490
    @maratkopytjuk3490 8 years ago +93

    Thank you, I tried to understand GP via papers, but only you could help me build up an understanding of the idea. It is great that you took the time to explain the Gaussian distribution and the important operations! You're the best!

    • @MrEdnz
      @MrEdnz 3 years ago +5

      Learning a new subject via papers isn’t very helpful indeed :) They expect you to understand basic principles of GP. However lectures like these or books start with the basic principles💪🏻

  • @daesoolee1083
    @daesoolee1083 3 years ago +3

    The best tutorial for GP among all the materials I've checked.

  • @augustasheimbirkeland4496
    @augustasheimbirkeland4496 2 years ago +12

    5 minutes in and it's already better than all 3 hours in class earlier today!

  • @fuat7775
    @fuat7775 1 year ago +2

    This is absolutely the best explanation of the Gaussian!

  • @sourabmangrulkar9105
    @sourabmangrulkar9105 4 years ago +5

    The way you started from basics and built up on it to explain the Gaussian Processes is very easy to understand. Thank you :)

  • @MattyHild
    @MattyHild 5 years ago +13

    FYI, the notation @22:05 is wrong. Since he selected an x1 to condition on, he should be computing mu2|1, but he is computing mu1|2.

  • @malharjajoo7393
    @malharjajoo7393 5 years ago +1

    Basic summary of lecture video:
    1) Recap on multivariate Normal/Gaussian distribution (MVN).
    - some info on conditional probability
    2) Some information on how sampling can be done from Univariate/Multivariate Gaussian distribution.
    3) 39:00 - Introduction to Gaussian Process (GP)
    It is important to note that GP is considered as a Bayesian non-parametric approach/model
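
Point 2 of the summary above (sampling from a multivariate Gaussian) can be sketched as follows. This is a minimal NumPy sketch, not code from the lecture: the function name `sample_mvn` and the values of `mu` and `cov` are illustrative assumptions. It uses the standard construction x = mu + L z, where L is the Cholesky factor of the covariance and z is a vector of independent standard normals.

```python
import numpy as np

def sample_mvn(mu, cov, n_samples, rng):
    """Draw n_samples from N(mu, cov) via x = mu + L z, with cov = L L^T."""
    L = np.linalg.cholesky(cov)                    # lower-triangular factor
    z = rng.standard_normal((len(mu), n_samples))  # independent N(0, 1) draws
    return (mu[:, None] + L @ z).T                 # one sample per row

rng = np.random.default_rng(0)
mu = np.array([0.0, 1.0])                   # hypothetical mean
cov = np.array([[1.0, 0.8], [0.8, 1.0]])    # hypothetical covariance
samples = sample_mvn(mu, cov, 20000, rng)

# Empirical moments should be close to mu and cov for a large sample.
emp_mean = samples.mean(axis=0)
emp_cov = np.cov(samples, rowvar=False)
```

With many samples, `emp_mean` and `emp_cov` should land near the values that were put in, which is a quick sanity check on the construction.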

  • @erlendlangseth4672
    @erlendlangseth4672 7 years ago +10

    Thanks, this helped me a lot. By the time you got to the hour mark, you had covered sufficient ground for me to finally understand gaussian processes!

  • @SijinSheung
    @SijinSheung 6 years ago +10

    This lecture is so amazing! The hand drawing part is really helpful to build up intuition regarding GP. This is a life-saving video for my finals. Many thanks!

  • @turkey343434
    @turkey343434 5 years ago +10

    Gaussian processes start at 1:01:15

  • @akshayc113
    @akshayc113 9 years ago +35

    Thanks a lot Prof. Just a minor correction for the people following the lectures. You made a mistake while writing out the formulae at 22:10
    You were writing out mean and variance of P(X1|X2) whereas the diagram was to find P(X2|X1). Since this is symmetric, you can just get them by appropriate replacements, but just letting slightly confused people know

    • @charlsmartel
      @charlsmartel 9 years ago +2

      +akshayc113 I think all that should change is the formula for the given graphs. It should read:
      mu_{2|1} = mu_2 + sigma_21 sigma_11^-1 (x_1 - mu_1). Everything else can stay the same.

    • @tobiaspahlberg1506
      @tobiaspahlberg1506 9 years ago +2

      I think he actually meant to draw x_1 where x_2 is in the diagram. This switch would agree with the KPM formulae on the next slide.
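
For readers following this thread, conditioning on x1 can be checked numerically. The sketch below assumes a bivariate Gaussian with made-up numbers (`mu`, `Sigma`, and the function name `condition_on_x1` are illustrative, not from the lecture) and uses the standard results mu_{2|1} = mu_2 + sigma_21 sigma_11^-1 (x_1 - mu_1) and sigma_{2|1} = sigma_22 - sigma_21 sigma_11^-1 sigma_12.

```python
import numpy as np

def condition_on_x1(mu, Sigma, x1):
    """Mean and variance of x2 given x1 for a bivariate Gaussian."""
    mu1, mu2 = mu
    s11, s12 = Sigma[0]
    s21, s22 = Sigma[1]
    mean = mu2 + s21 / s11 * (x1 - mu1)   # mu_{2|1}
    var = s22 - s21 / s11 * s12           # sigma_{2|1}
    return mean, var

mu = np.array([0.0, 0.0])                    # hypothetical mean
Sigma = np.array([[1.0, 0.8], [0.8, 1.0]])   # hypothetical covariance
mean_2g1, var_2g1 = condition_on_x1(mu, Sigma, x1=1.0)
```

Swapping the roles of the indices 1 and 2 gives the formulas for x1 given x2, which is the version the commenters say was written on the slide.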

  • @宇智波鼬337
    @宇智波鼬337 4 years ago

    I've gone through so many lectures trying to understand Gaussian processes. So far, you are the only one who I think can make me understand them. Thanks a lot, man.

  • @sarnathk1946
    @sarnathk1946 6 years ago +9

    This is indeed an Awesome lecture! I liked the way the complexity is slowly built over the lecture. Thank you very much!

  • @HarpreetSingh-ke2zk
    @HarpreetSingh-ke2zk 3 years ago

    I started learning about multivariate Gaussian processes in 2011, but it's terrible that I only got to this video when 2021 is ending.
    He explained things in a way that even a layperson could grasp.
    He first explains the meaning of the concepts, followed by an example/data, and last, the theoretical representation. Typically, mathematics presenters/writers avoid using data to provide examples.
    I'm always on the lookout for lectures like these, where the theoretical understanding is demonstrated through examples or data.
    Often the concepts are not difficult to grasp, but the presenter/writer makes us dig deep through complex notation without providing any examples.

  • @Ricky-Noll
    @Ricky-Noll 3 years ago

    One of the best videos of all time on YouTube

  • @ziangxu7751
    @ziangxu7751 3 years ago +2

    What an amazing lecture. It is much clearer than lectures taught in my university.

  • @KhariSecario
    @KhariSecario 3 years ago +1

    Here I am in 2021, yet your explanation is the easiest one to understand from all the sources I gathered! Thank you very much 😍

    • @matej6418
      @matej6418 1 year ago

      me in 2023, still the same

  • @AhmedAltakrouri
    @AhmedAltakrouri 3 months ago

    Thank you for sharing this. This is the best lecture I have ever watched that gives a gentle introduction to Gaussian processes.

  • @marcyaudrey6608
    @marcyaudrey6608 1 year ago

    This lecture is amazing Professor. From the bottom of my heart, I say thank you.

  • @francescocanonaco5988
    @francescocanonaco5988 5 years ago

    I tried to understand GP via blog articles, papers, and a lot of videos. Best video ever on GP! Thank you!

  • @LynN-he7he
    @LynN-he7he 4 years ago

    Thank you, thank you, thank you!! I was stuck on a homework problem and still figuring out what it means to be a testing vs. training data set and how they play a role in the Gaussian kernel function. I was stuck for the last 3 days, and your video from about the 45-minute to 1-hour mark made the lightbulb go off!

  • @Gouda_travels
    @Gouda_travels 3 years ago

    After one hour of smooth explanation, he says "and this brings us to Gaussian processes" :)

  • @jx4864
    @jx4864 2 years ago

    After 30 minutes, I am sure that he is a top-10 teacher in my life

  • @pradeepprabakarravindran615
    @pradeepprabakarravindran615 11 years ago +1

    Thank you! Your videos are so much more awesome than any ML lecture series I have seen so far! -- Grad Student from CMU

  • @dennisdoerrich3743
    @dennisdoerrich3743 6 years ago +2

    Wow, you saved my life with this genius lecture ! I think it's a pretty abstract idea with GP and it's nice that you can walk one through from scratch !

  • @MB-pt8hi
    @MB-pt8hi 6 years ago +1

    Very good lecture, full of intuitive examples which deepen the understanding. Thanks a lot

  • @huitanmao5267
    @huitanmao5267 8 years ago +1

    Very clear lectures! Thanks for making them publicly available!

  • @AnilKumarnn
    @AnilKumarnn 3 months ago

    Best lecture on GP. Complement it with examples from GPT or Claude.

  • @bluestar2253
    @bluestar2253 3 years ago

    One of the best teachers in ML out there!

  • @dwhdai
    @dwhdai 5 years ago +3

    wow, this is probably the best lecture I've ever watched. on any topic.

  • @DistortedV12
    @DistortedV12 5 years ago +1

    Finally! This is gold for beginners like me! Thank you Nando!! Saw you on the committee at the MIT defense, great questions!

  • @DanielRodriguez-or7sk
    @DanielRodriguez-or7sk 4 years ago +1

    Thank you so much Professor De Freitas. What a clear explanation of GP

  • @richardbrown2565
    @richardbrown2565 4 years ago +1

    Great explanation. I wish that the title mentioned that it was part one of two, so that I would have known it was going to take twice as long.

  • @xingtongliu1636
    @xingtongliu1636 6 years ago

    This becomes very easy to understand with your thorough explanation. Thank you very much!

  • @Raven-bi3xn
    @Raven-bi3xn 4 years ago

    Am I correct to think that the "f" notation at 30:30 is not the same "f" as at 1:01:30? In the latter case, each f consists of all the 50 f distributions that are exemplified in the former case?
    If that understanding is correct, then in sampling from the GP, each sample is a 50-by-1 vector from the 50-dimensional multivariate Gaussian distribution. This 50-by-1 vector is what Dr. Nando refers to as a "distribution over functions".
    In other words, given the definition of a stochastic process as "indexed random variables", each random variable of the GP is drawn from a multivariate Gaussian distribution. In that viewpoint, each "indexed" random variable is a function at 1:01:30.
    This lecture from 2013 is truly an amazing resource.
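
The sampling step described in this comment can be sketched as below. This is a hedged illustration rather than the lecture's code: the squared-exponential kernel `se_kernel`, its unit length scale, the grid `x_test`, and the jitter value `1e-6` are all assumptions. Each column of `f_prior` is one draw of a 50-dimensional vector from N(0, K), and plotting a column against `x_test` shows it as one sampled "function".

```python
import numpy as np

def se_kernel(a, b, length_scale=1.0):
    """Squared-exponential kernel between two 1-D arrays of inputs."""
    sqdist = (a[:, None] - b[None, :]) ** 2
    return np.exp(-0.5 * sqdist / length_scale**2)

rng = np.random.default_rng(1)
x_test = np.linspace(-5.0, 5.0, 50)   # 50 test inputs (assumed grid)
K_ss = se_kernel(x_test, x_test)      # 50 x 50 prior covariance

# A small jitter on the diagonal keeps the Cholesky factorisation stable.
L = np.linalg.cholesky(K_ss + 1e-6 * np.eye(50))

# Three draws from the zero-mean prior: each column is one 50-vector.
f_prior = L @ rng.standard_normal((50, 3))
```

The distribution over functions is the 50-dimensional Gaussian itself; each column is a single sample from it, evaluated at the 50 chosen inputs.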

  • @philwebb59
    @philwebb59 3 years ago

    1:05:58 Analog computers existed way before the first digital circuits. A WWII vintage electrical analog computer, for example, consisted of banks of op amps, configured as integrators and differentiators.

  • @malharjajoo7393
    @malharjajoo7393 5 years ago +2

    1:04:08 - It would be good to emphasize that the test set is actually used for generating the prior ... I had a hard time making sense of it because the test set is usually provided separately (but in this case we are generating it!!)

  • @jingjingjiang6403
    @jingjingjiang6403 7 years ago

    Thank you for sharing this wonderful lecture! Gaussian process was so confusing when it was taught in my university. Now it is crystal clear!

  • @sanjanavijayshankar5508
    @sanjanavijayshankar5508 4 years ago

    Brilliant lecture. One could not have taught GPs better.

  • @taygunkekec9616
    @taygunkekec9616 10 years ago

    Very clearly explained. The dependencies for learning the framework are given concisely and incrementally, while details that make the framework harder to understand are carefully avoided (you will understand what I mean if you try to dig through Rasmussen's book on GP).

  • @emrecck
    @emrecck 3 years ago

    That was a great lecture, Mr. Freitas, thank you very very much!
    I watched it to study for my Computational Biology course, and it really helped.

  • @heyjianjing
    @heyjianjing 4 years ago +1

    Around 56:00, I don't think we should omit the conditioning sign on mu*. It is conditioned on f: E(f*|f), not E(f*). Otherwise, the expected value of f* alone should just be zero.

  • @oliverxie9559
    @oliverxie9559 3 years ago

    Really great video for reading Gaussian Processes for Machine Learning!

  • @jinghuizhong
    @jinghuizhong 9 years ago

    The lecture is quite clear and it gave me insight into the key ideas of Gaussian processes.
    Many thanks!

  • @adrianaculebro9176
    @adrianaculebro9176 5 years ago

    Finally understood how this idea is explained and applied using mathematical language

  • @bottomupengineering
    @bottomupengineering 10 months ago

    Great explanation and pace. Very legit.

  • @woo-jinchokim6441
    @woo-jinchokim6441 7 years ago +1

    by far the best structured lecture on gaussian processes. love it :D

  • @dieg3005
    @dieg3005 8 years ago +1

    Thank you very much Prof. de Freitas, excellent introduction

  • @austenscruggs8726
    @austenscruggs8726 2 years ago

    This is an amazing video! Clear and digestible.

  • @sak02010
    @sak02010 5 years ago +1

    thanks a lot prof. Very clean and easy to understand explanation.

  • @Jacob011
    @Jacob011 10 years ago

    Absolutely superb lecture! Everything is clearly explained even with source code.

  • @pattiknuth4822
    @pattiknuth4822 3 years ago

    Extremely good lecture. Well done.

  • @quantum01010101
    @quantum01010101 4 years ago

    That is clear and flows naturally, Thank you very much.

  • @JaysonSunshine
    @JaysonSunshine 7 years ago +2

    Correct me if I am wrong, but isn't the whole cluster of examples starting at 36:35 flawed? Nando shows three points in a single dimension: x1, x2, x3 and their corresponding f-values: f1, f2, f3. It seems these points are three samples from a univariate normal distribution with a scalar variance, rather than what he shows, i.e. a vector from R^3 with a 3x3 covariance matrix.

    • @JaysonSunshine
      @JaysonSunshine 7 years ago

      On further reflection, perhaps you're doing a non-parametric approach in which you assign a Gaussian per point...
      ...since the distribution you're forming is empirical, it seems it would be more precise to say the mean vector of the f-distribution is [f1, f2, f3], yes?

    • @DESYAAR
      @DESYAAR 7 years ago

      I agree. That took me a while as well.

  • @maudentable
    @maudentable 4 years ago

    a master doing his work

  • @chenqu773
    @chenqu773 2 years ago

    It looks like the axis labels in the graph on the right side of the presentation, at around 20:39, are not correct. It should probably be x1 on the x-axis, i.e. it would make sense if μ12 referred to the mean of variable x1, rather than x2, judging from the equation shown on the next slide.

  • @saminebagheri4175
    @saminebagheri4175 7 years ago +7

    amazing lecture.

  • @pankayarajpathmanathan7009
    @pankayarajpathmanathan7009 7 years ago

    The best lecture for gaussian processes

  • @kiliandervaux6675
    @kiliandervaux6675 3 years ago

    Thank you so much for this amazing lecture. I wanted to applaud at the end but I realised I was in front of my computer.

  • @niqodea
    @niqodea 5 years ago +1

    BEAST MODE teaching

  • @sumantamukherjee1952
    @sumantamukherjee1952 9 years ago

    Lucidly explained. Great video

  • @bingtingwu8620
    @bingtingwu8620 1 year ago

    Thanks!!! Easy to understand👍👍👍

  • @darthyzhu5767
    @darthyzhu5767 8 years ago +1

    really clear and comprehensive. thanks so much.

  • @dracleirbag5838
    @dracleirbag5838 3 years ago

    I like the way you teach

  • @TheTacticalDood
    @TheTacticalDood 3 months ago

    This is amazing. Thanks so much!

  • @crestz1
    @crestz1 8 months ago

    Amazing lecturer

  • @GiiWiiDii
    @GiiWiiDii 4 years ago +2

    23:56 That would be nice, thanks!

  • @AlqGo
    @AlqGo 7 years ago +7

    39:55 your function seems to be exponential but the mean is assumed to be 0... that's a really confusing example, Prof.

    • @heyjianjing
      @heyjianjing 4 years ago

      My understanding is that at 39:55, the mean refers to the prior information about the mean of f. Without any information, before seeing any data, zero is not a bad prior for the mean. Once you see the data, the mean is updated per the equations at 56:00, and the posterior mean is no longer zero.

  • @katerinapapadaki4810
    @katerinapapadaki4810 5 years ago +1

    Thanks for the helpful lecture!
    The only thing I want to point out is that if you put labels on the axes of your plots, it would be more helpful for the listener to understand from the beginning what you describe

  • @김수필-n4q
    @김수필-n4q 3 years ago

    Awesome explanation. thanks

  • @kevinzhang4692
    @kevinzhang4692 3 years ago

    Thank you! It is a wonderful lecture

  • @jhn-nt
    @jhn-nt 2 years ago

    Great lecture!

  • @tospines
    @tospines 6 years ago +3

    I think I got the essence of GP, but what I cannot understand is why we take the mean to be 0 when clearly it is not 0. I mean, if we suppose that f* is distributed as a Gaussian with mean 0, the expectation value of f* must be 0. Could anyone explain this to me?

    • @oskarkeurulainen6414
      @oskarkeurulainen6414 6 years ago +1

      0 is only the mean for the prior for f*. When we know values of other variables that are correlated with f*, then we actually want to consider the mean when f* is conditioned on the other observed variables. Compare with the ellipse in the beginning with x1 and x2, both have mean 0 but if we observe one of them to be positive, the other one is also likely to be positive and thus has a positive conditional expectation.
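
The point made in this reply can be illustrated with a small noise-free GP regression sketch. Everything here is assumed for illustration: the kernel `se_kernel`, the training inputs, and the exponential-looking targets `f_train` are hypothetical. The prior mean is zero everywhere, yet the conditional mean K_*^T K^{-1} f is non-zero near the data and falls back towards zero far away from it.

```python
import numpy as np

def se_kernel(a, b, length_scale=1.0):
    """Squared-exponential kernel between two 1-D arrays of inputs."""
    sqdist = (a[:, None] - b[None, :]) ** 2
    return np.exp(-0.5 * sqdist / length_scale**2)

x_train = np.array([-2.0, 0.0, 1.5])
f_train = np.exp(x_train)                 # hypothetical observations
x_test = np.array([-2.0, 0.5, 10.0])      # at, near, and far from the data

K = se_kernel(x_train, x_train) + 1e-8 * np.eye(3)   # tiny jitter for stability
K_s = se_kernel(x_train, x_test)

# Posterior (conditional) mean of f* given f, with a zero prior mean.
mu_post = K_s.T @ np.linalg.solve(K, f_train)
```

In this setup `mu_post[0]` reproduces the observed value at x = -2, `mu_post[1]` is clearly positive, and `mu_post[2]` is essentially zero because x = 10 is almost uncorrelated with the training inputs.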

  • @redberries8039
    @redberries8039 4 years ago

    This was a good explanation.

  • @huuducdo143
    @huuducdo143 10 months ago

    Hello Nando, thank you for your excellent course.
    Following the bell example, the mu_{1|2} and sigma_{1|2} you wrote should be for the case where we are given X2 = x2 and try to find the distribution of X1 given X2 = x2. Am I correct?
    Other interpretations are welcome. Thanks a lot!

  • @rajupowers
    @rajupowers 7 years ago

    Symmetric positive definite intuition @18:00

  • @deephazarika2259
    @deephazarika2259 6 years ago +2

    When estimating 'f', why is each point treated as a separate dimension and not as different points in the same dimension?

    • @malekebadi9805
      @malekebadi9805 4 years ago

      As far as I understood, Gaussian process (regression) serves two purposes: refining the prior (and posterior) and predicting the response for new points. If you collect new observations for the same points you are refining the posterior, and if you extend your new point to a new dimension, you're predicting. In the former case, the confidence interval between two points remains relatively fat. Querying for points in new dimensions (given that practically you can do that) squeezes the confidence interval. Theoretically, it doesn't matter I guess. Think of an experiment in which you keep the x the same in every iteration but you read different y's. Think of another experiment in which your x values are changing from one iteration to another and you receive y's. From the GP point of view, both are the same.

  • @xinking2644
    @xinking2644 2 years ago

    Is there a mistake at 21:58? Shouldn't it be conditioned on x1 instead of x2?

  • @swarnendusekharghosh9539
    @swarnendusekharghosh9539 3 years ago

    Thank you sir for a clear explanation

  • @MrStudent1978
    @MrStudent1978 2 years ago

    1:12:24
    What is mu(x)? Is that different from mu?

  • @itai19
    @itai19 4 years ago

    Thanks for the lecture, I have a problem with the discussion around 11 - from my understanding, a spherical case does represent some correlation between X and Y, as X is a sub-component of the max radius calculation, meaning larger x leads to smaller possible values of y (or at least lower probability for higher values). In other words, the covariance can be approximated to something like E[x*sqrt(r^2-x^2)]. Are we saying that ends up being zero, i.e. correlation is unable to express such a dependency?
    My intuition currently understands a square to express 0 correlation

  • @黃翰-g1p
    @黃翰-g1p 6 months ago

    Isn't the right-side formula at 22:19 for x1|x2, not for x2|x1?

  • @ojussinghal2501
    @ojussinghal2501 2 years ago

    36:45
    Regression

  • @rsilveira79
    @rsilveira79 6 years ago

    Awesome lecture, very well explained!

  • @terrynichols-noaafederal9537
    @terrynichols-noaafederal9537 10 months ago

    For the noisy GP case, we assume the noise is sigma^2 * the identity matrix, which assumes iid. What if the noise is correlated, can we incorporate the true covariance matrix?
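
One way to explore this question: if the observations are y = f + eps with eps ~ N(0, Sigma_n), then the covariance of y is K + Sigma_n, so a full noise covariance can take the place of sigma^2 times the identity in the standard predictive equations. The sketch below is an illustration under that assumption only. The kernel, the data, and the matrix `Sigma_n` are hypothetical, and it is not taken from the lecture.

```python
import numpy as np

def se_kernel(a, b, length_scale=1.0):
    """Squared-exponential kernel between two 1-D arrays of inputs."""
    sqdist = (a[:, None] - b[None, :]) ** 2
    return np.exp(-0.5 * sqdist / length_scale**2)

x_train = np.array([-1.0, 0.0, 1.0])
y_train = np.array([0.2, 0.9, 0.1])   # hypothetical noisy observations
x_test = np.array([0.5])

# Hypothetical correlated noise covariance (symmetric positive definite),
# used in place of sigma^2 * I.
Sigma_n = 0.1 * np.array([[1.0, 0.5, 0.0],
                          [0.5, 1.0, 0.5],
                          [0.0, 0.5, 1.0]])

K = se_kernel(x_train, x_train)
K_s = se_kernel(x_train, x_test)
K_y = K + Sigma_n                     # covariance of the noisy observations

mu_post = K_s.T @ np.linalg.solve(K_y, y_train)
cov_post = se_kernel(x_test, x_test) - K_s.T @ np.linalg.solve(K_y, K_s)
```

Setting `Sigma_n` to `sigma**2 * np.eye(3)` recovers the iid case described in the comment.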

  • @EbrahimLPatel
    @EbrahimLPatel 9 years ago

    Excellent introduction to the subject! Thank you :)

  • @RohitKumarGuptarkg
    @RohitKumarGuptarkg 9 years ago

    Great lecture... A minor clarification: at 38:25 in the video, it is said that given X's you want to model f's. What exactly do you mean there?

    • @maratkopytjuk3490
      @maratkopytjuk3490 8 years ago

      You want to describe the similarity between the f's via the given x's. The multivariate Gaussian summarizes the connection/correlation between these (three) points.

  • @ho4040
    @ho4040 2 years ago

    Holy shit...what a good lecture

  • @afish3356
    @afish3356 4 years ago

    An extremely good lecture! Thank you for recording this :) :)

  • @yunlongsong7618
    @yunlongsong7618 4 years ago

    Great lecture. Thanks.

  • @JadtheProdigy
    @JadtheProdigy 6 years ago +1

    Can someone explain why f is distributed with mean 0?

  • @KristoferPettersson
    @KristoferPettersson 6 years ago +1

    If I run the example code I get an error stating that my K_ isn't a positive-definite matrix. What am I doing wrong?
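
A common cause of this kind of error is that a kernel matrix built from closely spaced inputs is positive definite in theory but numerically singular, so the Cholesky factorisation can fail. A frequently used workaround is to add a small "jitter" to the diagonal. The sketch below is a guess at the situation, not a diagnosis of the commenter's code: the kernel, the input grid, the helper name `safe_cholesky`, and the jitter value `1e-6` are assumptions.

```python
import numpy as np

def se_kernel(a, b, length_scale=1.0):
    """Squared-exponential kernel between two 1-D arrays of inputs."""
    sqdist = (a[:, None] - b[None, :]) ** 2
    return np.exp(-0.5 * sqdist / length_scale**2)

def safe_cholesky(K, jitter=1e-6):
    """Cholesky factor of K + jitter * I; the jitter value may need tuning."""
    return np.linalg.cholesky(K + jitter * np.eye(K.shape[0]))

x = np.linspace(-5.0, 5.0, 50)
K_ = se_kernel(x, x)      # nearly singular for smooth kernels on dense grids
L = safe_cholesky(K_)
```

If the error persists, it is worth checking that `K_` is symmetric and that the kernel is applied to the same set of points on both sides.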

  • @dhruv385
    @dhruv385 6 years ago

    Wow! Great Lecture!

  • @ahaaha8462
    @ahaaha8462 5 years ago

    amazing lecture, thanks a lot

  • @xesan555
    @xesan555 8 years ago

    Nando, you are wonderful...

  • @yousufhussain9530
    @yousufhussain9530 9 years ago

    Amazing lecture!

  • @brianstampe7056
    @brianstampe7056 5 years ago

    Very helpful. Thanks!

  • @SimoneIovane
    @SimoneIovane 5 years ago

    Great lesson! Thank you!

  • @homtom2
    @homtom2 9 years ago

    This helped me so much! Thanks!