Literally the best explanation on the internet for GP Regression Models. He's not trying to be cool, but genuinely trying to explain the concepts
Thank you my man. And yes, my risk of being cool is zero lol
Agreed, I've been trying to understand GPs for a task at work and this is the easiest to understand explanation I've found, liked and subbed!
Absolutely excellent but I'd like to suggest a minor correction at 1:54. The linear regression DOES account for the uncertainty of the line; the linear regression prediction interval would produce intervals that widen as you move away from the data just like the Bayesian ones. (Many textbooks give the approximate formula for prediction intervals that don't widen but the actual non-approximate formula will give widening bands.)
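For anyone who wants to check the widening claim: a minimal sketch, assuming simple (one-feature) linear regression, where the exact standard error for a new observation at x0 is s * sqrt(1 + 1/n + (x0 - x_bar)^2 / Sxx). The data and the helper name `prediction_se` are made up for illustration and are not from the video. The prediction interval is this value times a t quantile, so the width follows the same pattern.

```python
import math

def prediction_se(xs, ys, x0):
    """Exact standard error of a new observation at x0 for simple linear regression."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # Residual variance with n - 2 degrees of freedom.
    rss = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    s2 = rss / (n - 2)
    # The (x0 - x_bar)^2 / sxx term is what makes the band widen away from the data.
    return math.sqrt(s2 * (1 + 1 / n + (x0 - x_bar) ** 2 / sxx))

# Illustrative data (an assumption, not taken from the video).
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [1.1, 1.9, 3.2, 3.8, 5.1, 6.2]

se_center = prediction_se(xs, ys, 3.5)   # at the mean of the observed x values
se_far = prediction_se(xs, ys, 20.0)     # far outside the observed range
```

Dropping the (x0 - x_bar)^2 / Sxx term gives the constant-width approximation mentioned above.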
*of mean zero with a standard deviation of 4.20
My postgrad supervisor literally told me to watch this a few times just so I can explain it clearly to Human Sciences people in my research proposal. Thanks for all your effort making it!
I know how hard it is to explain this topic, so simply and comprehensively. I am extremely thankful for your efforts.
Great balance between technical depth and intuition for me right now. I love how you say that multiplication "is like", but still is not. This gives intuition, but provides a warning for the day when we have come further in our understanding. 🤗
Ha yea glad these details aren’t unnoticed. It’s a careful game making sure I never say anything *technically* wrong.
@@Mutual_Information Yes sir, sometimes to make a point you need to recontextualize the matter into something specific to make things easier to understand
I learn something each time I rewatch the video. So much better than sitting in lectures where you only listen once.
The first time I watched this a month or so ago I had no idea what was happening. However I have recently needed to use a GP and after a lot of reading up on them and coming back to this video, I can appreciate it a lot more with some understanding :)
Yea my topics require some prerequisite 😅 but with a little getting used to on the notation and basics of probability/stats, I think it should be fairly digestible. Glad you got something out of it.
The production value of this is insane
Really helpful. GPs finally clicked watching and working through this video. Great balance of accessibility + technical details. Thanks bro
I read the distill article and came back to watch the whole video again for the second time. Now it's crystal clear! Thanks so much!!
Distill is an epic educational source :)
@@Mutual_Information It's sad that they've been on hiatus since last year :(
Hopefully they'll come back some day
The best explanation of Kernel so far!
Agreed! 😎
I will have to watch your video several times to understand (if I can) everything but undoubtedly your video is professional and very very well done!! Congratulations
Thanks Xavier!
The way you motivate the problem really adds insights for understanding.
holy sh*t, you have unlocked the secret of GP and Bayesian stuff... I have struggled so hard to understand what is even GP as it is so abstract. Thank you so much for your great work!
Happy to help my friend ;)
In min 3:00 I saw a smile coming out of my mouth, just how happy I was when I was listening to you!
This is a masterpiece work! Really thank you :)
Thank you very much - glad it's getting some love :)
I just discovered your videos yesterday and now they're popping up on my YT home screen and I feel a bit like a little boy in a toyshop. How have these high quality fantastic tutorials evaded me for so long, when I spend so much time looking at technical content on YouTube? Seriously impressive! I'll definitely be one to share your videos when the opportunity arises.
Much appreciated! I got some really cool stuff coming in May. If you like this stuff, you'll *love* what's coming. Thanks again!
It looks like you have optimized the hyperparameters of making an awesome video. So concise, but still a sprinkle of humor here and there. Awesome visualizations, so appreciated.
haha I thought that was gonna be a nitpick, pleasantly surprised - thank you!
Literally the GOAT. So clear and concise, and the pace is perfect! A humble suggestion: adding a quick explanation of what GP does would be ideal (also, what are these sample prior, etc.) Dimension-wise, it would make more sense to me.
Great video. I've seen GPs mentioned a few times in papers and always glossed over it. Thanks for the great explanation!
Thanks
I guess this is the first time I am commenting on YouTube! This tutorial is one of the best that I've ever seen on GPs and even math. The visual morph changes of the GP corresponding to different hyperparameters are fascinating. Wish you the best!
Awesome tutorial! Where could I find more information on GPs for graphs and varying length strings?
Truly amazing how you turned such a complex topic into an accessible explanation, thanks a lot!
Glad you liked it :)
What a smooth way to explain such complex math , thank you
Great reference video, I'm sure I will come back to it again and again. The level of detail in all the simulations you do is just incredible. Do you make all your animations in manim?
Thanks brother! And I don’t use manim actually. I like representing data with Altair, which is like a better version of matplotlib. So I have a small library which turns Altair pics into vids.
Perfect! - research, delivery, production, duration, pictorial intuitiveness, mathematical rigor, naïve friendly 👏🏽
Thank you Sourav - I'm trying!
Props for explaining such a complex model in a friendly way
Phenomenal video. I genuinely can't thank you enough for how accessible this was. I'm sure I'll come back and reference it, or your other work, as I continue preparing for my upcoming internship working on physics-informed neural networks.
That's awesome! Glad it helped
Absolutely perfect! I heard of GPs and was wondering what they were exactly, wanted a bit of intuition of how and why they work, how to use them, just as a quick intro or motivation before learning them later on.
This video answered all of this in a duration that is absolutely perfect: not too long so that it can be watched "leisurely", and not too short so that you still give enough information that I don't have the impression that I learnt nothing.
Didn't know your channel, will definitely check the rest out!
Thanks a lot! That's exactly what I'm going for. Relatively short and dense with useful info. Glad it worked for you.
Dude has named his channel mutual information so when we look for the concept of mutual information, all his videos will pop up 🤣 genius!
Such a clear and intuitive explanation of GPs! Great work! Excellent video on this topic. Brief and elegant explanations!
Thank you - that’s what I’m going for!
Man, you have some beautiful explanations, and the way you explain the details is somehow very simple to understand, thank you so much!
Glad you liked it!
Such a clear and intuitive explanation of GPs! Great work!
Thanks Minh :)
What the hell that's a great channel I'm so glad I found you. Production quality is spot on, thank you for taking such care !
Happy to have you! Welcome!
Thanks a lot for this vid man it literally saved my life, you're really one hell of a teacher
Thank you - I'm getting a little better over time, but it's a work in progress.
If you love what I'm doing, one thing that would be *huge* for me, is if you tell anyone you think might be interested. This channel is pretty small and it'll be easier to work on it if it gets a little more attention : )
@Mutual Information best of luck man 🫡
Great video! Just dived into GPs by learning about their application in system identification techniques. In fact I'm studying for my exam right now and looked for a video that nicely sums up this topic and gives some intuition. This video matches my needs 100%, thank you very much.
Excellent - you're the ideal viewer!
By far this is the best video I have seen on this subject! Thank you very much!
You are the best man! Thank you for your videos, you're helping a lot of students, because your explanations are so clear and intuitive.
I hope the best for you.
Thank you - you improved my Friday
Excellent video on this topic. Brief and elegant explanations!
A great video! Thanks. I used GP at work many years ago and enjoyed the framework a lot.
Thank you! I've been researching a paper dedicated to the Gaussian approach to time series prediction (as a task in a lab), and I really struggled with it. But after your video, everything has been sorted out in my head, and I finally have understood it!
Exactly what I've hoped to do - happy it helped!
Love the level you've pitched this video at.
This is truly a great explanation that helps me to connect all the dots together!! Thanks a lot!!!!
I'm glad it helped. When I was studying GPs, these are the ideas that floated in my head - happy to share them.
Quite literally
Badum tssch
Best video for GP I have seen! Thank you so much!
As I wrote you on LinkedIn, this is probably the best video on GPs out there! I know it takes a long time to put together something of such high quality, but I hope I will see more of your videos in the future! 😊
thanks, means a lot - and it's coming. This one has been taking awhile, but it'll be out soon :)
The visualizations are the catch. Just excellent 😊
a really really hard-core video... thanks D.J
Another great video! Love seeing each one come up
Really grateful for this video. Got the gist of it but will have to pull out my old friend pen and paper and work through the math of the GP assumption. Thanks for the neat definitions :)
Now you make me love math again. Thanks.
A really good explanation! Though I wasn't able to understand everything, I would keep coming to this video until I do. ;D
Thank you. Happy to answer any questions too
I built a model years ago that I never realized is perhaps a GP model. I only learned about GP models weeks ago. It doesn't use any real-valued data; only binary vectors. The similarity kernel is Hamming distance. Other than that, it's basically what he described here.
I would be really helped by putting variable definitions on screen while they're in use! I find myself forgetting what f and f* are for example as I mull it over and watch the explanation. Amazing video! I'm a fan.
Thanks Graham! It's always a balance thinking about what does/doesn't go on screen. More recently, I'm biasing towards *less* on screen, b/c I've gotten feedback that what's on screen can be overwhelming.
But, if you have some question about what may be confusing, ask here and I may be able to help
@@Mutual_Information I'm thinking what was hard for me is that everything was defined and then they were used? The viewer needs to remember what each thing means before they can give it the context, and context allows us to combine things and save on short term memory?
So far beyond my abilities. Like Frankenstein's monster, I am soothed by its music.
Brilliant video! loved the graphics.
Amazing video on GP's .
You are a legend.
Your observation of the product of two normally distributed variables is true for the following reason: given independent scalar random variables X,Y, we have Var(XY) = Var(X)Var(Y) + Var(X) (E(Y))^2 + Var(Y) (E(X))^2. Given two multivariate random normals U,V with mean zero, we may choose to work in a basis (possibly different for the two distributions) where the covariance matrices are diagonal. Then all components of each vector are independent and so Cov(U) Cov(V) = Cov(UV) by working element-wise. Since this is true in one basis, it must therefore be true in every basis.
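A numerical sanity check of the covariance claim, as a sketch rather than a proof. It assumes U and V are independent and zero-mean, and it reads "Cov(U) Cov(V)" as the element-wise (Hadamard) product, which follows directly from E[U_i V_i U_j V_j] = E[U_i U_j] E[V_i V_j] without any change of basis. The two covariance matrices are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two illustrative (assumed) covariance matrices for zero-mean vectors U and V.
cov_u = np.array([[1.0, 0.6], [0.6, 2.0]])
cov_v = np.array([[1.5, -0.4], [-0.4, 0.5]])

n = 400_000
u = rng.multivariate_normal(np.zeros(2), cov_u, size=n)
v = rng.multivariate_normal(np.zeros(2), cov_v, size=n)

# Element-wise product of the two independent samples.
w = u * v

empirical = np.cov(w, rowvar=False)   # sample covariance of the product
hadamard = cov_u * cov_v              # element-wise product, not a matrix product
```

With this many samples the two matrices should agree to a couple of decimal places. The check covers only the covariance, so it says nothing about whether the product is Gaussian.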
Straight forward and explained well thank you
What an excellent video! Just thinking about the amount of effort that must have gone into this gives me anxiety
What an explanation! Became a fan in seconds.
I did understand just a few things, but still I watched this video till the very end - the production value is insane! And maybe I’ll need GPs in the future? :D
You definitely deserve many more subscribers, your videos are great!
Awesome Explanation. Thank you.
Excellent explanations and visualizations. Helped me a lot, thank you!
Super quality content! Thank you so much: I subscribed and I hope your number of subscribers increases more and more to motivate you to keep going!
Thank you - I hope so too!
This video is really nice. Thank you so much for creating this content material.
Super fricking impressed! Bravo
Really good explanation, the animations help so much. Thank you, I really appreciate it.
You're very welcome - Glad to hear it's landing as intended!
You might want to look into probabilistic numerics. Cheers, great video you made there!
7:01 shouldn’t it be the other way around? If most ys are deemed different from an x, then the GP would sample closer ys and therefore the functions shouldn’t wiggle too much. Or am I missing something?
Not sure if others are also interested, but I think a coding example with GPyTorch could be interesting.
This is something I'm working on! I'd like to make code samples available alongside my videos. They aren't currently available b/c the modeling code is intertwined with the animation code, so it would be terribly difficult to decipher if released as-is. My plan is.. once my video production workflow is a little more streamlined, I'll pair these videos with code snippets.
A truly fantastic explanation to them! The visuals were instructive and well presented, thank you for making this!
and thank you for watching ;)
Wooow! Excellent quality video!
Another great video. Keep up the good work!
Love these vids. Can you do a video about normalizing flows in the future?
I plan on making one. It’s a very interesting idea. In the meantime, there is already an excellent explanation : th-cam.com/video/i7LjDvsLWCg/w-d-xo.html
Great video!! I have a question about the graphs at 9:25. Shouldn't the heat map for the x vs x' look so that the K is highest at the origin (0,0) and fades moving to the other corners instead of what's shown? I might still not fully understand it. Thanks!
The heatmap just shows v*x*x'. I think the constant v here is .5. At the origin, it's .5*0*0 = 0. In the top right, it's .5*10*10 = 50.
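The arithmetic in this reply can be reproduced in a few lines. This is a sketch assuming the linear kernel K(x, x') = v * x * x' with v = 0.5 (the reply only says "I think" for that constant) and a grid running from 0 to 10 on both axes. The function name is made up for illustration.

```python
def linear_kernel(x, x_prime, v=0.5):
    """Linear kernel K(x, x') = v * x * x' for scalar inputs."""
    return v * x * x_prime

grid = list(range(0, 11))  # x and x' both run from 0 to 10
heatmap = [[linear_kernel(x, xp) for xp in grid] for x in grid]

origin_value = heatmap[0][0]      # 0.5 * 0 * 0
corner_value = heatmap[10][10]    # 0.5 * 10 * 10
```

So the linear kernel is smallest at the origin and grows toward the top-right corner, unlike a distance-based kernel, which peaks along the diagonal x = x'.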
Keep this up! It really helps
Excellent video, thanks. I have a question at the 1:20 mark. The distribution that you show, ‘p(y|x)’, does not look Gaussian. Am I missing something? Can a GP predict a non-Gaussian distribution?
Oh I see the confusion. That's not a GP model. That's just some non-parametric model to show the type of thing we're going for. It's there to draw a contrast when we start making assumptions. We assume the normal distribution at some point.. and that's what gives you the p(y|x) gaussian.. but in the general case, that's not necessarily true. But this isn't very clear in the vid, - sorry about the confusion :/
@@Mutual_Information got it. Thanks
Nice visualizations man. Just discovered your channel.
13:08 isn't X a d*n matrix, meaning that each row represents a feature and each column is a datapoint x?
Not in this case. Here, one row provides all the features for one example.
Awesome video!. Only one question. Minute 09:20. A linear kernel does not imply that the realizations of the random process must be linear, does it?. Thanks!!
Thanks Jesus. And regarding your Q, in the broader model, no, a linear kernel doesn't imply the realizations need to be linear, since there is a noise component in the overall kernel. That allows points along a sample to be different in a nonlinear way.
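A small sketch of that distinction, assuming a linear kernel v * x * x' and an independent noise term added on the diagonal. The values of v and the noise variance are illustrative, not taken from the video, and `sample_prior` is a hypothetical helper. A draw from the linear kernel alone is an exact straight line, while a draw from the kernel plus noise is not.

```python
import numpy as np

def sample_prior(K, rng):
    """Draw one sample from N(0, K) via an eigendecomposition (works for singular K)."""
    vals, vecs = np.linalg.eigh(K)
    vals = np.clip(vals, 0.0, None)  # clip tiny negative eigenvalues from round-off
    return vecs @ (np.sqrt(vals) * rng.standard_normal(len(vals)))

rng = np.random.default_rng(0)
x = np.linspace(1.0, 10.0, 10)   # evenly spaced inputs
v, noise_var = 0.5, 1.0          # illustrative values, not taken from the video

K_linear = v * np.outer(x, x)                      # linear kernel only
K_noisy = K_linear + noise_var * np.eye(len(x))    # linear kernel plus a noise term

f_linear = sample_prior(K_linear, rng)
y_noisy = sample_prior(K_noisy, rng)

# On evenly spaced x, a straight line has zero second differences.
bend_linear = np.max(np.abs(np.diff(f_linear, n=2)))
bend_noisy = np.max(np.abs(np.diff(y_noisy, n=2)))
```

Here `bend_linear` is zero up to floating-point error, while `bend_noisy` is clearly not.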
Missed a lot of math, will get back later!
Great video ! ❤
This is excellent!
What a great video! Very helpful, thanks!
You're very welcome!
Great video. Thank you 😊
Really well made explanation :)
Thanks, glad the effort is appreciated!
Nice video - love it
Hey, love the videos. What software do you use to create your visuals?
I use a very dope, though static, python plotting library called Altair. And then I have a personal library that turns many of them into videos.
This is brilliant. Thank you.
Great video - I subbed!
thank you for your understandable video!
I'm still wondering what the point of "similar y for similar x" is. Is it to make sure the function is smooth, or does it have some other use?
looking forward to your reply!
The goal of the problem is to predict y for a given incoming x.. and we can learn to do this by observing many pairs of (x_i, y_i)'s. So we make an assumption: "If x1 is similar to x2 (that is, K(x1, x2) is large/positive), then we expect y1 and y2 to be close". With that assumption, we can form a prediction for y when given an x.. and that basically is formed by determining "which y value would best work with our similar-y's for similar x's given the x's and y's observed?" and then you can form your prediction that way. The GP does all this hard work for you and allows for noise and whatnot.
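To make the "similar y for similar x" idea concrete, here is a minimal sketch of the standard GP posterior mean, k_*^T (K + noise * I)^{-1} y. It assumes an RBF kernel with length scale 1 and a small noise variance, and the training points are made up for illustration. It is not the code behind the video.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0):
    """K(x, x') = exp(-(x - x')^2 / (2 * length_scale^2)) for 1-D inputs."""
    diff = a[:, None] - b[None, :]
    return np.exp(-0.5 * (diff / length_scale) ** 2)

# Illustrative observations (assumed, not from the video).
x_train = np.array([0.0, 1.0, 2.0, 5.0])
y_train = np.array([0.0, 0.8, 0.9, -1.0])
noise_var = 0.01

K = rbf_kernel(x_train, x_train) + noise_var * np.eye(len(x_train))
alpha = np.linalg.solve(K, y_train)

def predict_mean(x_new):
    """Posterior mean at x_new: similarity-weighted combination of the observed y's."""
    k_star = rbf_kernel(np.atleast_1d(x_new).astype(float), x_train)
    return k_star @ alpha

near = predict_mean(1.1)[0]   # close to x = 1, where y was 0.8
far = predict_mean(50.0)[0]   # nothing similar was observed, so it falls back to the prior mean of 0
```

The prediction near x = 1 stays close to the y observed there. A point with no similar x's gets essentially no pull from the data.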
The best video about GPs I have ever seen! Thank you for sharing. I would like to reproduce the graphs that you created in a script, but unfortunately I cannot see any code for it on your GitHub page! Is it possible to access those scripts, with the examples that you produced?
Thanks Matteo - I appreciate it!
Unfortunately, the code for this one was heavily intertwined with the animation code, so I didn't make it public. But I wasn't doing anything you can't learn from reading the GPyTorch docs
Hey, great video! In practice, in my machine learning class we did both GPR and GPC, and I found it very difficult to scale them to more than 10k samples. It seems that despite the advantages it has, it is not useful for a lot of practical problems. Can you maybe make a video on how to invert a matrix with less than O(n^3) complexity and which software someone could use for GPR/GPC on larger data?
Yea, so that's a big component of GP research. Getting the cost down. A dominant approach is inducing point methods, where you try to summarize a large dataset with a smaller data set of "inducing points". It's a popular approach, but introduces another source of uncertainty.
In my experience, I tend to use GPs with smaller datasets.
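One simple flavor of the inducing point idea is a Nystrom-style low-rank approximation, K_nn ≈ K_nm K_mm^{-1} K_mn. The sketch below assumes an RBF kernel and evenly spaced inducing locations, both illustrative choices. It is not necessarily what any particular library does, and real methods usually learn the inducing locations.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0):
    """RBF kernel matrix between two sets of 1-D inputs."""
    diff = a[:, None] - b[None, :]
    return np.exp(-0.5 * (diff / length_scale) ** 2)

x = np.linspace(0.0, 10.0, 500)   # n = 500 training inputs
z = np.linspace(0.0, 10.0, 20)    # m = 20 inducing locations (assumed evenly spaced)

K_nn = rbf_kernel(x, x)                            # full n x n matrix, built here only for comparison
K_nm = rbf_kernel(x, z)                            # n x m cross-covariance
K_mm = rbf_kernel(z, z) + 1e-8 * np.eye(len(z))    # m x m, with jitter for numerical stability

# Low-rank approximation built from an m x m solve instead of an n x n one.
K_approx = K_nm @ np.linalg.solve(K_mm, K_nm.T)

max_abs_error = np.max(np.abs(K_nn - K_approx))
```

The expensive linear algebra now involves the 20 x 20 matrix rather than the 500 x 500 one, which is where the savings come from. The approximation error is the extra source of uncertainty mentioned above.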
Damn. Just Damn. This is great! Like: really really really great.
Thank you Ian - more good stuff cooking!
Regarding the modeling example shown at min 10, how would i go about doing this process but for multidimensional inputs?
It's very simple to extend to multi-dimensional inputs in fact. If x, x' are vectors, as long as K(x, x') returns a single number, then you can apply everything exactly as you see here. The visualizations will be a little trickier, but the whole idea still works.
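As a sketch of what that looks like in code, here is an RBF kernel written for vector inputs, assuming X is stored with one row per example (n rows, d columns). The kernel choice and the random data are illustrative assumptions. Each pair of rows still maps to a single number, so the resulting kernel matrix can be used exactly as in the 1-D case.

```python
import numpy as np

def rbf_kernel(X1, X2, length_scale=1.0):
    """X1 is (n1, d), X2 is (n2, d); returns the (n1, n2) matrix of K(x, x')."""
    sq_dists = np.sum((X1[:, None, :] - X2[None, :, :]) ** 2, axis=-1)
    return np.exp(-0.5 * sq_dists / length_scale ** 2)

rng = np.random.default_rng(0)
X = rng.standard_normal((5, 3))   # 5 examples, each with 3 features (one row per example)
K = rbf_kernel(X, X)              # 5 x 5 kernel matrix, regardless of d
```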
Great video
This is amazing, thanks
If you were using a chi-distribution as a kernel could you combine kernel-a and kernel-b multiplicatively? If I recall, gaussian distributions are linear, ie: their sum is a gaussian, however their product is not. Chi-distributed variables on the other hand, when you multiply their products, you get an f-distribution, which is tractable.
Really cool video! You definitely have an INSANE amount of material to make more videos on! Definitely subscribing!
Very interesting idea.. maybe there is a very special choice of kernel such that the multiplication kernel-and-sampled-function-distributions holds exactly, just like a sum does. I really don't know!
If I had to guess, I'd say there is no such kernel. The problem arises from the multivariate normal, which is always operating in a GP, regardless of the kernel. And that problem is.. if you sample two vectors from a multivariate normal.. and multiply them together element wise.. the distribution of that thing is NOT some other multivariate normal. The kernel can only change the covariance of those two vectors, but that problem doesn't depend on those covariances.
And thanks for the subscription!
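A quick way to see the non-normality of an element-wise product, as a sketch using the simplest case of two independent standard normal samples rather than full multivariate draws. Any normal distribution has kurtosis 3, while the product of two independent standard normals has kurtosis 9 in theory, since E[(XY)^4] = E[X^4] E[Y^4] = 9 and Var(XY) = 1.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
a = rng.standard_normal(n)
b = rng.standard_normal(n)
product = a * b   # element-wise product of two independent normal samples

def kurtosis(x):
    """Fourth standardized moment; equals 3 for any normal distribution."""
    centered = x - x.mean()
    return np.mean(centered ** 4) / np.mean(centered ** 2) ** 2

kurt_normal = kurtosis(a)          # should be close to 3
kurt_product = kurtosis(product)   # noticeably larger, so the product is heavier-tailed
```

Changing the covariances rescales the product but does not make it normal, which matches the point about the kernel not being able to fix this.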
@@Mutual_Information Ah that makes sense; it's called a *gaussian process* not an *insert-random-pdf* process after all !
Thanks for the reply!
Thanks for the video!!!
I think I would call the samples of functions inductive bias of the model.
Yea, the prior samples show the inductive bias. Intuitively, I think of it as "the model without data"