Comments •

  • @alisalimy9387 4 years ago +1

    Hard to find a good explanation of this problem, until I found this! Great job, Alexander!!!

  • @dsilvavinicius 7 years ago +5

    Finally, a good explanation of the geometric interpretation of the two-dimensional Gaussian! Great job!

  • @rajanalexander4949 2 years ago +2

    Great explanation -- especially the graphical interpretation and example. Thank you!

  • @user-sj2zu9rn9q 4 years ago

    Thank you, Alexander. The best one I have seen.

  • @MacMac0710 5 years ago +29

    This is great because you explain notation as well as giving solid examples!

    • @blasttrash 1 year ago

      At 6:30, at the bottom right, there is a contour plot where it's printed that
      (sigma_11)^2 > (sigma_22)^2.
      What exactly is sigma_11 in that diagram? Is it the distance from the center point of the contour plot to the first concentric circle? Or the distance from the center to the 2nd concentric circle? Or from the center to the 3rd? Or is it something else? Similarly, what is sigma_22?

    • @prathamhullamballi837 1 year ago

      @@blasttrash When you look at the contour plot along only the x-axis, the variance of the distribution along that axis is (sigma_11)^2. Similarly, for the y-axis it is (sigma_22)^2. See how the 'spread' in the contour plot along the x-axis is larger than along the y-axis? That is precisely what (sigma_11)^2 > (sigma_22)^2 means.
      Note that the circles are just contours, and the distance from a contour to the centre is not necessarily sigma_11 or anything in particular.
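
      A quick numerical sketch of this point, using hypothetical values rather than the ones in the video: for a diagonal covariance, the diagonal entries are exactly the marginal variances along the axes, regardless of which contour ring you measure.

      ```python
      import numpy as np

      # Hypothetical diagonal covariance with (sigma_11)^2 > (sigma_22)^2.
      Sigma = np.array([[4.0, 0.0],   # (sigma_11)^2 = 4 -> wider spread along x
                        [0.0, 1.0]])  # (sigma_22)^2 = 1 -> narrower along y

      # Sample from the Gaussian and check the marginal variances.
      rng = np.random.default_rng(0)
      X = rng.multivariate_normal(mean=[0.0, 0.0], cov=Sigma, size=100_000)
      print(X[:, 0].var(), X[:, 1].var())  # ~4.0 and ~1.0
      ```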

  • @amirkeramatian653 7 years ago +3

    Very helpful video with clear explanations. Thanks a lot!

  • @jiongwang7645 11 years ago

    Thank you very much; this is succinct and easy to understand, way better than many textbooks!!

  • @visheshsinha_ 3 years ago

    Thank you so much. I was struggling to understand this, and you made it really simple.

  • @christinhainan 11 years ago +6

    I find your TH-cam videos much more helpful for learning compared to the class videos. Maybe because I suffer from a short attention span.

  • @K4moo 10 years ago +2

    Thank you for sharing, very useful.

  • @renato5668 2 years ago

    This is a great explanation; it helped a lot.

  • @avijoychakma8678 5 years ago +1

    Nice explanation. Thank you so much.

  • @karthiks3239 10 years ago

    Really nice video. Thanks a lot!

  • @amizan8653 10 years ago +4

    that was extremely helpful, thanks for posting!

  • @technokicksyourass 6 years ago +11

    The summary at the end was the best part. I would have liked some more explanation on what the different shapes of the contour plot mean.

    • @omarebacc07 3 years ago

      When the covariance values in the covariance matrix (the off-diagonal values) are, or tend toward, 1, the contours look like ellipses inclined at approximately 45 degrees, or follow a straight line (positive association between the variables). In contrast, when the covariance values are equal to zero, the curves are closer to circles, i.e., there is no association between the variables (similar to the figure at 6:13).
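
      A small plotting sketch of these cases, with illustrative covariance matrices of my own (not the video's): zero covariance with equal variances gives circular contours, zero covariance with unequal variances gives axis-aligned ellipses, and strong positive covariance tilts the ellipses toward 45 degrees.

      ```python
      import numpy as np
      import matplotlib.pyplot as plt
      from scipy.stats import multivariate_normal

      # Illustrative covariances: circle, axis-aligned ellipse, tilted ellipse.
      covs = {"cov=0, equal var":   [[1.0, 0.0], [0.0, 1.0]],
              "cov=0, unequal var": [[4.0, 0.0], [0.0, 1.0]],
              "cov=0.9, positive":  [[1.0, 0.9], [0.9, 1.0]]}

      x, y = np.meshgrid(np.linspace(-4, 4, 200), np.linspace(-4, 4, 200))
      grid = np.dstack((x, y))
      fig, axes = plt.subplots(1, 3, figsize=(12, 4))
      for ax, (title, cov) in zip(axes, covs.items()):
          ax.contour(x, y, multivariate_normal([0, 0], cov).pdf(grid))
          ax.set_title(title)
      plt.show()
      ```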

  • @PravNJ 4 years ago

    Thank you. This was helpful!

  • @ProfessionalTycoons 5 years ago +1

    thank you for this post!

  • @nyctophilic1790 4 years ago

    Thank you so much, awesome work!

  • @tomt8691 7 years ago

    This is fantastic!
    Thank you!

  • @ProfessionalTycoons 5 years ago

    Clear explanation, very good.

  • @osamaa.h.altameemi5592 10 years ago

    Very nice video thank you.

  • @chyldstudios 2 years ago

    Solid explanation.

  • @hcgaron 6 years ago

    Is the vector x assumed to be a row vector? I ask only because we have x - mu, which is a row vector, inside the exponential. To subtract components, would we not assume that x is a row vector like mu?

  • @ZLYang 11 months ago +1

    At 4:32, if x and μ are row vectors, [x-μ] should also be a row vector. Then how can we multiply Σ^(-1) by [x-μ], since Σ^(-1) is 2×2 while [x-μ] is 1×2?
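
    For what it's worth, here is how the row-vector convention resolves dimensionally (a sketch with made-up numbers): writing the quadratic form as [x-μ] Σ^(-1) [x-μ]^T gives (1×2)(2×2)(2×1), which is a scalar.

    ```python
    import numpy as np

    # Made-up values, just to check the shapes in the row-vector convention.
    x     = np.array([[1.0, 2.0]])    # 1x2 row vector
    mu    = np.array([[0.5, 1.0]])    # 1x2 row vector
    Sigma = np.array([[2.0, 0.3],
                      [0.3, 1.0]])    # 2x2 covariance

    d = x - mu                               # 1x2 row vector
    delta2 = d @ np.linalg.inv(Sigma) @ d.T  # (1x2)(2x2)(2x1) -> 1x1
    print(delta2.item())                     # the scalar squared distance
    ```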

  • @user-ob2pe2wx7u 2 years ago

    Ha, the approach of decomposing the covariance matrix would be a nice example of PCA!

  • @spyhunter0066 2 years ago

    I'd like to know what you call the x value in the univariate case, or the set of x values in the multivariate case, of your Gaussian distributions. Do you name them a "data set" or a "variable set"? Also, what makes the mean the same size as the x data? Should we think that we create one mean for every added x data point in our data set? Is that why we average them when we find the best estimated value in the end? Thanks in advance.

  • @andrew-kd4jk 11 years ago

    very good tutorial

  • @nates3361 2 years ago

    Excellent explanation

  • @parshantjuneja4811 2 years ago

    Thanks dude! I get it now! Well almost ;)

  • @spyhunter0066 2 years ago

    In the formula at 2:11, when you compute the inverse of the Sigma matrix inside the exp(...), do you use the unit-matrix (Gauss-Jordan) method, some code, or another method? Cheers.

  • @emirlanaliiarbekov8729 2 years ago

    clearly explained!

  • @spyhunter0066 2 years ago

    Could you explain more about the sum over vectors in your notation for the maximum likelihood estimates at 1:45? As far as I can tell, there is only one data set, namely one x vector. So what are you actually summing over with the index j? Cheers.

  • @abdoelrahmanbashir4096 4 years ago

    thank you teacher :)

  • @kaushik900 7 years ago +1

    At 11:02, you mean Xb = X * sqrt(EIGENVALUE MATRIX), right?

  • @100uo 10 years ago

    awesome, thank you man!

  • @utsavdahiya3729 5 years ago

    Thank youuuuuuuuuu♥️♥️♥️♥️♥️♥️♥️

  • @shivampadmani_iisc 5 months ago

    Thank you so much so much sooooo much

  • @samarths 7 years ago

    thanks a lot

  • @alaraayhan7762 3 years ago

    thank you !!

  • @CSEfreak 10 years ago

    Amazing, thank you!

  • @elumixor 4 years ago +1

    I think there is an error in the maximum likelihood formula, in the order of the vector multiplication. The way it is written makes the operation a dot product, not the outer product.
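
    A minimal numpy check of the two orderings, on toy data of my own: with samples stored as rows, D.T @ D sums the 2×2 outer products that the covariance MLE needs, whereas a row times its own transpose collapses to a scalar.

    ```python
    import numpy as np

    rng = np.random.default_rng(1)
    X = rng.normal(size=(500, 2))   # 500 toy samples, one per row
    mu = X.mean(axis=0)
    D = X - mu                      # each row is (x_j - mu)

    # Average of the OUTER products (x_j - mu)^T (x_j - mu), in row notation:
    Sigma_hat = (D.T @ D) / len(X)  # 2x2 matrix, summed all at once
    print(np.allclose(Sigma_hat, np.cov(X, rowvar=False, bias=True)))  # True
    ```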

  • @georgestamatelis7812 3 years ago

    thank you

  • @spyhunter0066 2 years ago

    Should the x vector also be a row vector of length d, just like the mu (mean) vector at 1:44?

  • @RonnyMandal75 7 years ago +47

    Haha, why would someone vote this down? This is great!

    • @boyangchen5544 5 years ago +1

      Exactly; the best I can find.

    • @chrischoir3594 4 years ago +3

      They voted it down because they are probably Democrats and they don't like truth and facts.

    • @llleiea 4 years ago +4

      Ronny Mandal, maybe because there are some small mistakes.

    • @fupopanda 4 years ago +3

      He does have mistakes and some real inconsistencies throughout the slides. Not enough to dislike, but enough not to be surprised by the dislikes.

    • @LegeFles 3 years ago +1

      @@chrischoir3594 I thought it was the Republicans who don't like truth and facts.

  • @ayasalama7965 6 years ago

    At 12:45, shouldn't the expression on top of the graph be XD rather than XC? Great video!

  • @laurent__9032 5 years ago

    Love your videos! Isn't there a small mistake in where you place the transpose? Shouldn't it be $\Delta^2=(x-\mu)^T\Sigma^{-1}(x-\mu)$ instead?

  • @martynasvenckus423 2 years ago +1

    At 5:32, Alexander says "The scaling of the sigmas is accomplished by creating a diagonal covariance matrix". Could you explain what "scaling of the sigmas" means? Where are they being scaled? Thanks.

    • @timvandewauw1045 2 years ago

      When calculating the joint distribution p(x1)p(x2) for the vector x_underlined = [x1 x2], he vectorizes (x1-mu1) and (x2-mu2) into the vector form (x_underlined - mu_underlined). I believe what he means by scaling of the sigmas is a similar transformation from two separate scalar sigmas to a matrix, in this case the covariance matrix Sigma.
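
      A small check of that vectorization with toy numbers of my own: the sum of the two univariate quadratics equals the single vectorized quadratic form once the sigmas are collected into a diagonal Sigma.

      ```python
      import numpy as np

      # Toy values for the two independent components.
      x  = np.array([1.2, -0.7])
      mu = np.array([0.5,  0.3])
      s1, s2 = 2.0, 0.5                 # sigma_1, sigma_2

      # Two separate univariate quadratics...
      q_sep = (x[0]-mu[0])**2 / s1**2 + (x[1]-mu[1])**2 / s2**2

      # ...equal one vectorized quadratic with a diagonal covariance matrix.
      Sigma = np.diag([s1**2, s2**2])
      d = x - mu
      q_vec = d @ np.linalg.inv(Sigma) @ d
      print(np.isclose(q_sep, q_vec))   # True
      ```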

  • @GundoganFatih 3 years ago

    At 6:28, why do we create a diagonal covariance matrix? If X is a feature set of two features (m×2), shouldn't Sigma be cov(X)?

  • @hayekpower5464 3 years ago

    Why is x a row vector instead of a column vector?

  • @user-ru9rm3rc7u 16 days ago

    Thanks for the wonderful explanation. Do you share the slides?

  • @dc6940 4 years ago

    So, when the features are independent, finding P(x1) and P(x2) individually and then multiplying them is the same as using the multivariate Gaussian distribution (6:13)? Is my understanding correct?
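
    That independence property is easy to verify numerically; a sketch with arbitrary parameters of my own (not the video's):

    ```python
    import numpy as np
    from scipy.stats import norm, multivariate_normal

    # Arbitrary independent components.
    mu1, s1 = 1.0, 2.0
    mu2, s2 = -0.5, 0.7
    x = np.array([0.3, 0.1])

    p_product = norm(mu1, s1).pdf(x[0]) * norm(mu2, s2).pdf(x[1])
    p_joint   = multivariate_normal([mu1, mu2], np.diag([s1**2, s2**2])).pdf(x)
    print(np.isclose(p_product, p_joint))  # True: independence <-> diagonal cov
    ```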

  • @heyptech1726 6 years ago

    nice

  • @d-rex7043 2 years ago

    This should be mandatory viewing, before being assaulted with the symbolic derivations!

  • @snesh93 3 years ago

    From 4:12 to 6:24, where the independent Gaussian models are explained, I have a basic doubt about the Sigma calculation. I find it hard to understand why Sigma needs to be a diagonal matrix of (sigma_1*sigma_1, sigma_2*sigma_2); shouldn't it be a matrix of the form [[sigma_1*sigma_1, sigma_1*sigma_2], [sigma_2*sigma_1, sigma_2*sigma_2]]? Can anyone explain that to me?

    • @AlexanderIhler 2 years ago

      The covariance matrix of a zero-mean Gaussian has entries sig_ij = E[xi xj]. So if xi and xj are independent, this is zero except along the diagonal. I think you're describing a rank-1 matrix? Which is different from independence in probability.
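
      A quick simulation of that point, with a toy setup of my own: independently drawn coordinates give an empirical covariance matrix whose off-diagonal entries are approximately zero.

      ```python
      import numpy as np

      rng = np.random.default_rng(2)
      x1 = rng.normal(0.0, 2.0, size=200_000)  # independent draws
      x2 = rng.normal(0.0, 0.5, size=200_000)  # independent of x1

      C = np.cov(np.stack([x1, x2]))           # empirical 2x2 covariance
      print(C.round(3))  # off-diagonals ~ 0; diagonals ~ 4.0 and 0.25
      ```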

  • @thomasbloomfield4070 7 years ago +1

    At 11:00 isn't that the eigenvalue matrix, not the eigenvector matrix?
    Thanks for the great video!

    • @pr749 7 years ago

      Yes, it is the singular value matrix (the square root of the eigenvalue matrix).
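
      In case a concrete version helps, here is a sketch of the sampling transform being discussed, with an example covariance of my own: scale standard-normal draws by the square roots of the eigenvalues, then rotate by the eigenvector matrix.

      ```python
      import numpy as np

      Sigma = np.array([[2.0, 1.2],
                        [1.2, 1.0]])      # example covariance (mine)
      lam, U = np.linalg.eigh(Sigma)      # Sigma = U diag(lam) U^T

      rng = np.random.default_rng(3)
      Z = rng.normal(size=(2, 100_000))   # standard normal, identity covariance
      Xb = np.sqrt(lam)[:, None] * Z      # scale by sqrt of the eigenvalues
      X  = U @ Xb                         # rotate by the eigenvectors
      print(np.cov(X).round(2))           # ~ Sigma
      ```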

  • @lemyul 4 years ago

    thanks alexa

  • @spyhunter0066 2 years ago

    At 1:34, the maximum likelihood estimate formula has a 1/N coefficient, while at 3:13 there is a 1/m coefficient. We know that N and m are the total number of values in the sums, but why did you use the two different notations N and m? Is it just to separate the univariate and multivariate cases while keeping the same meaning? Also, the j values in the lower and upper limits of the sum symbols are not so clear in this notation. Should we write j=1 to j=m (or N), for instance?

  • @farajlagum 9 years ago

    Thumbs up!

  • @muratakjol1437 3 years ago +1

    Summary: 13:02

  • @user-bz8nm6eb6g 4 years ago

    wow

  • @spyhunter0066 2 years ago

    One more question about the example at 4:24: you said x1 and x2 are independent variables. Independent of what? As far as I can see, you can use two univariate formulas in this example, but when you combine them to get the combined likelihood, you have to have a mean vector of size 2 and a Sigma matrix of size 2x2. Is that always the case? The sizes of the mean vector and the Sigma matrix seem to be defined by the number of x values being combined. Is that right? I saw another example somewhere else, where you can have L(μ=28, σ=2 | x1=32 and x2=34), for instance, to find the combined likelihood at x1=32 and x2=34, and it uses only one mean and one sigma for both. REF: th-cam.com/video/Dn6b9fCIUpM/w-d-xo.html&ab_channel=StatQuestwithJoshStarmer

  • @samfriedman5031 6 months ago

    At 4:07, the MLE for sigma-hat should be X times X-transpose (an outer product), not X-transpose times X (an inner product).

  • @quangle5701 3 years ago

    Can anyone explain how to vectorize the formula at 5:16? Thanks

  • @livershotrawmooseliver2498 10 years ago

    What is meant by compressing a 2D Gaussian function in 3D?

    • @AlexanderIhler 10 years ago +1

      Sorry; where is that?
      Most likely I simply meant that drawing a 2D Gaussian distribution requires a 3D drawing -- 2 variables x1, x2, plus the probability p(x1,x2). It's inconvenient to try to render 3D functions, so we usually plot contours in 2D instead (x1 and x2), with the contours indicating the lines of equal probability, p(x1,x2) = constant.

    • @livershotrawmooseliver2498 10 years ago

      Is it possible to compress a 2D Gaussian function?
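
      To make Alexander's explanation above concrete, a minimal sketch (illustrative mean and covariance, not the video's) rendering the same 2D Gaussian both as a 3D surface and as the equal-probability contours p(x1,x2) = constant:

      ```python
      import numpy as np
      import matplotlib.pyplot as plt
      from scipy.stats import multivariate_normal

      rv = multivariate_normal(mean=[0, 0], cov=[[2.0, 0.8], [0.8, 1.0]])
      x1, x2 = np.meshgrid(np.linspace(-4, 4, 200), np.linspace(-4, 4, 200))
      p = rv.pdf(np.dstack((x1, x2)))

      fig = plt.figure(figsize=(10, 4))
      ax3d = fig.add_subplot(1, 2, 1, projection="3d")
      ax3d.plot_surface(x1, x2, p)   # the 3D rendering...
      ax2d = fig.add_subplot(1, 2, 2)
      ax2d.contour(x1, x2, p)        # ...vs. the 2D equal-probability contours
      plt.show()
      ```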

  • @bingbingsun6304 11 months ago

    Learning.

  • @torTHer68 3 years ago

    What a laugh xd

  • @ilyaskapenko8089 4 years ago

    At th-cam.com/video/eho8xH3E6mE/w-d-xo.html
    why is Delta^2 = (x-mu) * Σ^(-1) * (x-mu)^T, and not
    Delta^2 = (x-mu)^T * Σ^(-1) * (x-mu)?

  • @austikan 5 years ago +1

    This guy sounds like Archer.

  • @Tokaexified 5 years ago +1

    I fell asleep watching this video with both hands under my head… when I woke up, both of them had fallen deeply asleep and wouldn't wake up for a while…

  • @amitcraul 6 years ago +1

    At 9:24, shouldn't it be Σ = UΛU^(-1) instead of the transpose?

    • @AlexanderIhler 6 years ago +1

      U is a unitary matrix, so they're the same

    • @ProfessionalTycoons 5 years ago

      Orthogonal matrix inverse == transpose
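
      A quick numpy check of that identity, on a toy covariance of my own:

      ```python
      import numpy as np

      Sigma = np.array([[2.0, 0.6],
                        [0.6, 1.0]])      # any symmetric covariance
      lam, U = np.linalg.eigh(Sigma)      # columns of U are the eigenvectors

      print(np.allclose(np.linalg.inv(U), U.T))          # True: U^(-1) == U^T
      print(np.allclose(U @ np.diag(lam) @ U.T, Sigma))  # so U L U^T == U L U^(-1)
      ```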

  • @harshitk11 2 years ago

    x needs to be a column vector instead of a row vector.

  • @spyhunter0066 2 years ago

    At 5:23, you should have said (x-mu) transpose.

    • @AlexanderIhler 2 years ago +1

      These slides have a number of transposition notation errors, due to my having migrated from column to row notation that year. Unfortunately TH-cam does not allow updating videos, so the errors remain. It should be clear in context, since I say “outer product” for the few non-inner products.

    • @spyhunter0066 2 years ago +1

      @@AlexanderIhler No worries, we spotted them.

  • @OrhaninAnnesi 7 years ago +1

    Please stop using probability density and probability interchangeably. The formula for a normal distribution never gives a probability, but a probability density, which can be greater than 1.

  • @umbhutta 4 years ago

    Wow, 1.5K supporters and just 40 haters :P

  • @danny-bw8tu 6 years ago

    It is not 2-dimensional, it is 3-dimensional.

  • @spyhunter0066 2 years ago

    Can you tell me the difference between the bivariate and multivariate cases? Can you also mention the case where the variables are dependent, where we add an extra dependence coefficient parameter? Here is a sample video that may give a better idea: th-cam.com/video/Ehm0mclZs54/w-d-xo.html

    • @AlexanderIhler 2 years ago

      Bivariate = 2 variables; multivariate = more than one variable. So bivariate is a special case, in which the mean is two-dimensional and the covariance is 2x2. Above 2 dimensions it is hard to visualize, so I usually just draw 2D distributions; but the mathematics is exactly the same.

    • @spyhunter0066 2 years ago

      @@AlexanderIhler Your initial case of a 1D Gaussian with only one x value is indeed a bivariate case, with one x value and two parameters, the mean and the sigma, right? Also, the bivariate case can be called the simplest multivariate case, right? If we have a data set x with multiple means and sigmas, we have to use your MULTIVARIATE CASE, with a vector of x values and mean values and a covariance matrix for the sigma values, don't we?
      Thanks for the help in advance.

    • @AlexanderIhler 2 years ago

      No, those are the parameters; if “x” (the random variable) is scalar, it is univariate, although the distribution may have any number of parameters. So, if x is bivariate, x=[x1,x2], the mean will have 2 entries and the covariance 4 (3 free parameters, since it is symmetric), so the distribution has 5 parameters total.

    • @spyhunter0066 2 years ago +1

      @@AlexanderIhler x is your data point, right? If it is only one scalar value, the case is called univariate, but if it is a vector of two scalar values, it is called bivariate by definition. That's it. For the bivariate and multivariate cases, where the data x is a vector of size d, the mean is also a vector of the same size as x. Thus, by definition, the covariance matrix has to be a d-by-d square matrix if x and the mean have dimension d, as you said. I assume you said 5 parameters in total because the symmetric terms in the covariance matrix are equal, so 4-1 = 3 parameters come from the Sigma matrix of size d x d.
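
      For readers following this thread: the count generalizes to d parameters for the mean plus d(d+1)/2 for the symmetric covariance. A tiny sketch (the helper function is mine, purely illustrative):

      ```python
      def gaussian_param_count(d: int) -> int:
          """Free parameters of a d-dimensional Gaussian: mean + symmetric cov."""
          return d + d * (d + 1) // 2

      print(gaussian_param_count(1))  # 2: mu and sigma^2 (univariate)
      print(gaussian_param_count(2))  # 5: matches the count in the reply above
      ```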

  • @fupopanda 4 years ago

    Too many mistakes in the slides. But otherwise good explanation.

  • @joschk8331 6 years ago +1

    The video is great, but your audio sucks. Buy an adequate microphone.

    • @jfrohlich 5 years ago +6

      I can understand everything he's saying just fine.