This video is gold. I tried to understand this from other resources but didn't. Thanks a lot, sir, you are a real mathmonk!
Very nicely explained. I will jump to your EM video right away! Thank you!
@apostoloumichail Determining the number of Gaussians is an NP-hard problem, so we have to make a guess. For example, we hypothesize that our process is the result of one Gaussian, two Gaussians, or three Gaussians, and estimate the parameters with the EM algorithm under each hypothesis. To decide which hypothesis fits the data best, we can apply Wilks' theorem to compare the likelihood of our data under these hypotheses and choose the hypothesis with the maximum likelihood.
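For anyone who wants to try that procedure concretely, here is a minimal Python sketch (assuming scikit-learn is available; the data X is made up for illustration) that fits one-, two-, and three-component mixtures with EM and compares the total log-likelihoods:

import numpy as np
from sklearn.mixture import GaussianMixture

# Toy data: samples from two well-separated Gaussians (illustration only)
rng = np.random.default_rng(0)
X = np.concatenate([rng.normal(-3, 1, 200), rng.normal(3, 1, 200)]).reshape(-1, 1)

for m in (1, 2, 3):
    gmm = GaussianMixture(n_components=m).fit(X)  # EM under the hood
    # score() returns the average log-likelihood per sample
    print(m, "components, total log-likelihood:", gmm.score(X) * len(X))

Note that the in-sample log-likelihood never decreases as m grows, so in practice one penalizes model complexity (e.g., with BIC, mentioned further down) rather than simply maximizing the likelihood.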
Excellent video, great explanation and proof. Thanks a lot.
Practical to follow, thanks.
I am lost as well. It is good practice to give examples at each stage instead of throwing out a lot of equations.
I am sorry to break this to you, but you are a brainlet and should steer clear of this excellent video.
Could you be any more condescending? He is trying to learn.
If you hate equations, you are studying the wrong subject, my friend...
The first three minutes of the video consist of examples in one and two dimensions, without a single equation.
I like your videos very much. Thank you.
Great explanation, liked the colouring
Very useful video for my studies.
Precise and clear information, Thank you!
what is the relationship between x and Z?
Check out nonparametric Bayesian models if you don't have the number of classes in advance.
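A rough sketch of that idea using scikit-learn's BayesianGaussianMixture, which approximates a Dirichlet-process mixture by truncating at an upper bound on the number of components and letting unneeded components receive negligible weight (the data X here is made up for illustration):

import numpy as np
from sklearn.mixture import BayesianGaussianMixture

# Toy data: two clusters (illustration only)
rng = np.random.default_rng(0)
X = np.concatenate([rng.normal(-3, 1, 200), rng.normal(3, 1, 200)]).reshape(-1, 1)

bgmm = BayesianGaussianMixture(
    n_components=10,  # an upper bound, not the final count
    weight_concentration_prior_type="dirichlet_process",
).fit(X)
# Components with non-negligible weight are the effective clusters
print(bgmm.weights_.round(3))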
Thanks for the good video.
Why is the latent variable z a vector, and specifically one of the unit vectors along each dimension? Can't we just work with a latent variable z that is a scalar?
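For what it's worth, a scalar z in {1, ..., m} carries exactly the same information; the one-hot encoding is a notational convenience (assuming the video follows the usual Bishop-style convention). With z \in \{e_1, ..., e_m\} and exactly one z_k = 1, the prior can be written as a product,

p(z) = \prod_{k=1}^m \alpha_k^{z_k},

so the log of the complete-data likelihood becomes a plain sum over k, which is exactly what the E- and M-steps of EM exploit.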
Thank you very much, it's really helpful.
Some Wolfram Code:
Manipulate[
 Plot[PDF[MixtureDistribution[{5, 5},
    {NormalDistribution[a, 2], NormalDistribution[b, 2]}], x],
  {x, -6, 6}, Filling -> Bottom, ColorFunction -> "Rainbow"],
 {a, -3, 1}, {b, -1, 3}]
Some segments in the video are cut together from non-adjacent timestamps.
Great videos! Have you thought about posting the PDFs of your drawings from your videos as notes?
Wait a second, isn't the formulation at 10:55 wrong? It should be a sum symbol, not a product. What we need to do is marginalize over z (in our case z is a discrete latent variable, so we use a sum rather than an integral) to get the marginal distribution, and then, to get the distribution of all N points, take the product over the points. This is exactly why we apply the EM algorithm: when we try to take the logarithm of the resulting equation, we can't simplify it directly because of the sum inside the log, and EM comes into play to resolve that difficulty. Am I wrong?
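For reference, the marginalization this comment describes (assuming the standard GMM setup with mixing weights \alpha_k) is

p(x) = \sum_z p(x, z) = \sum_{k=1}^m \alpha_k N(x \mid \mu_k, \Sigma_k),

and for N i.i.d. points the log-likelihood is

\log p(x_1, \ldots, x_N) = \sum_{n=1}^N \log \sum_{k=1}^m \alpha_k N(x_n \mid \mu_k, \Sigma_k).

The logarithm cannot be pushed inside the inner sum, which is precisely the difficulty the EM algorithm is designed to work around.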
Gautam Garg I think Z is the underlying generating mechanism of x. That is why Z is called a latent variable.
I concur.
肖晗 Well, I don't think Z is really an underlying generating mechanism of x. Say you have k buckets of numbered balls, where the numbers in each bucket are distributed according to a Gaussian, each bucket with its own parameters (\mu and \sigma^2 for the Gaussian). The Z's merely represent which bucket you pick, from what I understand of it, and the \alpha_k's give the probability that you pick the k-th bucket.
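The bucket story translates directly into a sampling procedure. A minimal Python sketch (the component parameters here are made up for illustration):

import numpy as np

rng = np.random.default_rng(0)
alphas = np.array([0.5, 0.3, 0.2])   # bucket-picking probabilities
mus    = np.array([-4.0, 0.0, 3.0])  # per-bucket means
sigmas = np.array([1.0, 0.5, 2.0])   # per-bucket standard deviations

z = rng.choice(3, size=1000, p=alphas)  # pick a bucket: the latent z
x = rng.normal(mus[z], sigmas[z])       # draw from that bucket's Gaussian

Discarding z and keeping only x gives exactly a sample from the mixture density \sum_k \alpha_k N(x \mid \mu_k, \sigma_k^2).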
Is it necessary to intersect the contours like you did?
Thanks, very helpful. May I ask what technology you use to produce this lecture?
The video is great! But I have a question: a mixture of m Gaussians is defined here, where m is the number of classes. What do you do if you do not know the number of classes?
Thanks,
Mike
For anyone else with this question, using the Bayesian Information Criterion (BIC) will help.
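For reference, BIC penalizes the maximized likelihood \hat{L} by model size:

BIC = k \ln N - 2 \ln \hat{L},

where k is the number of free parameters and N the number of data points; you fit a mixture for each candidate number of components and keep the one with the smallest BIC.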
So in this video we have finitely many values of z. Is it possible to extend z to an uncountable set? What would the joint PDF be in that case?
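Yes, that generalization exists; it is a continuous mixture. If z ranges over a continuum with density p(z), the sum over components becomes an integral:

p(x) = \int p(x \mid z)\, p(z)\, dz = \int N(x \mid \mu(z), \Sigma(z))\, p(z)\, dz,

and the joint density is p(x, z) = N(x \mid \mu(z), \Sigma(z))\, p(z).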
I don't quite understand why p(x) is a sum of \alpha_k N(x) terms but p(x, z) is a product of them. Shouldn't they be the same?
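They are consistent; the product form is just notation for picking out one component. With the one-hot z (exactly one z_k = 1),

p(x, z) = \prod_{k=1}^m [\alpha_k N(x \mid \mu_k, \Sigma_k)]^{z_k},

so for z = e_j the product collapses to the single term \alpha_j N(x \mid \mu_j, \Sigma_j). Summing over the m possible values of z then gives the marginal

p(x) = \sum_{k=1}^m \alpha_k N(x \mid \mu_k, \Sigma_k).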
Very useful.
Nice explanation. Which software do you use for the presentation?
good mythical morning!
m=2 here?
thanks for the video but it would be a lot better if it were legible
Good explanation, the written part could improve though.
Around 1:07, why do you say it is going to be a convex combination of the individual PDFs? Why should it be convex?
Consider the critical points:
1. Where Gaussians 1 and 2 intersect
2. Where Gaussians 2 and 3 intersect
Although all three Gaussians contribute at every point x, it is the interaction between these specific pairs of Gaussians that mostly shapes the combined PDF near these points.
At point 1, Gaussian 1's PDF is decreasing while Gaussian 2's PDF is increasing, giving the combined PDF its convex shape near this point (decreasing and then increasing).
At point 2, Gaussian 2's PDF is decreasing while Gaussian 3's PDF increases a little and then decreases, giving the combined PDF a little convexity near the point before it becomes concave.
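If it helps to see this, here is a rough Python sketch (component weights and parameters chosen arbitrarily) that plots three Gaussian PDFs and their convex combination so you can inspect the shape near the crossover points:

import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import norm

x = np.linspace(-8, 8, 500)
alphas = [0.4, 0.3, 0.3]
mus    = [-3.0, 0.0, 3.5]
sigmas = [1.0, 0.8, 1.5]

# Weighted component PDFs and their sum (the mixture)
pdfs = [a * norm.pdf(x, m, s) for a, m, s in zip(alphas, mus, sigmas)]
for k, p in enumerate(pdfs):
    plt.plot(x, p, "--", label=f"component {k + 1}")
plt.plot(x, np.sum(pdfs, axis=0), "k", label="mixture")
plt.legend()
plt.show()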
I'm lost too
It seems you are quite confused, because whenever you get stuck, you skip over the doubt. I have observed this in almost all of your videos.
probability and vectors together? Oh man
I do not seem to be able to link to "video lectures dot net", but you will find some really good explanations there.
I am lost!!!
thx :-)
How wrong would it be if you had just said "A Gaussian mixture model is simply a linear combination of m Gaussians with u_k and C_k"? It would have saved the first 10 minutes...
In fact, a GMM is not an arbitrary LINEAR combination of m Gaussians. Rather, it is a CONVEX combination of m Gaussians.
Actually, all CONVEX combinations are LINEAR combinations but not all linear combinations are convex.
Yeah, true. I wanted to say that a GMM is not just any linear combination, but specifically a convex one.
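To make the distinction concrete: a convex combination is a linear combination \sum_{k=1}^m \alpha_k N_k whose coefficients additionally satisfy \alpha_k \ge 0 for all k and \sum_{k=1}^m \alpha_k = 1, which is exactly what is needed for the mixture to remain a valid probability density (non-negative and integrating to 1).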
Not really helpful as an introduction to be honest.