Why would you divide `result.z / result.w` at the very end? What is the point of "perspective shrinking" the distance factor? Seems an unnecessary step, particularly if z was already normalized.
I understood a lot but am ultimately confused. What do we do when the camera is not pointing towards z? How does it work if the camera is pointing towards an object with z = 0?
Even an ability to discern patterns lends little here, in the generalized nature of 3D matrix transformations, perspective and projection. We could just exchange the x and z values thereby looking down the X-Axis. Really though, all has to be learned and understood independently. Less we are subject to hack-bloating, more we are able to design transformation matrices in specific ways for particular computations. The camera itself is a component which is described by a transformation matrix. The order which transformations are applied is very important to boot. Where there is a stage, a camera and an object, there are transformation matrices to describe them. Each to have their own designated scale, rotation and translation. A final transformation matrix is produced by ordered multiplication. Correct order is usually projection * stage * camera * object * vertex, applying first the object scale-rotation-offset matrix to the vertex. Camera zoom-orbit-position followed by stage augmentation-angle-locale, all effected before perspective projection 🖼 The matrix to make the camera look down the X-Axis would contain orbit (rotation) parameters in cells camera[1].yz and camera[2].yz. Per Google AI, "A rotation matrix to look down the x-axis in 3D is: [1 0 0; 0 cos(θ) -sin(θ); 0 sin(θ) cos(θ)]". Interpretation to viably incorporate takes a lot of discernment, which might come from taking to 2D. Where there is no 'camera', it is invisibly defined as the identity matrix.
Because it is a ratio between the x and y axes, specifically the width and the height, which means one of them is the base that will always be 1.0 (or 100%), whereas the other is a percentage of that base. For instance, a screen with 500 pixels of height and 1000 pixels of width will have an aspect ratio of 0.5:1. The reason why you only multiply x and not y is, once again, that it is a ratio: if you multiply both of them, nothing changes, and this is not what we want. Assuming that an object originally comes from a space that is considered square, the values are distributed equally across the x-axis and y-axis, but that is not the case in screen space because its width and height are not equal. For example, take a square with one of its vertices at 0.5x, 0.5y. What happens if it is converted to screen space without multiplying by the aspect ratio? 0.5y = 50% of the height and 0.5x = 50% of the width, and 50% of 500 pixels and 50% of 1000 pixels are clearly not the same. However, if you now multiply x by 0.5 (the aspect ratio that we just calculated), 0.5 x 0.5 = 0.25, and 25% of 1000 pixels is indeed equal to 50% of 500 pixels, so the square is now rendered correctly in screen space.
I need to understand something. Does the projection matrix receive the vertices of the world objects already normalized, or does the matrix take care of normalizing them into the range -1 to 1?
The projection matrix receives the values as they are in world space (not normalized), and the normalization of x's, y's, and z's (between -1 and 1) happens as we multiply by the projection matrix and then perform the perspective divide (which performs the division by w).
@@pikuma So "world values" refers to values that are outside the range -1 to 1. For example, I can put an arbitrary value for a vertex, maybe (5.0, 2.0, 3.0), and then the matrix will take care of normalizing it so that it is within that range (-1 to 1).
@@pikuma The last question. If I provide one of my vertices with a z coordinate of 60.0 and my "zfar" is 20.0, will it not be seen on the screen? Thank you very much for responding.
@@stevenriofrio7963 There's a little more to it, and it involves something called clipping. There is a stage where we clip all the triangles to only have objects inside the view frustum. The clipping happens at the top, bottom, left, right, and also the near and far planes. That's why vertices outside znear and zfar get discarded (clipped out of our final view) and we only render objects inside the view (between -1 and 1).
I have seen this video a few times and I will watch it again. There's one thing that you don't mention: do I need to convert degrees to radians? Doesn't the computer use radians instead of degrees?
Most graphics frameworks expect values to be in radians. Degrees are only used to display or input angles from the user via UI. In programming, it's usually all done in radians.
Eyes on the prize! This guy is a national pride haushs
I was wondering, isn't the following matrix correct?
projectionMatrix = [
    [aspectRatio * FOV, 0, 0, 0],
    [0, FOV, 0, 0],
    [0, 0, lambda, 1],
    [0, 0, -lambdaOffset, 0],
]
since we have to subtract the lambdaOffset for the Z component, wouldnt it be better if it was in the 3rd column? (, '-')a
I’m embarrassed, but why do we multiply the x component by the whole aspect ratio, instead of multiplying x by screen width and y by screen height? The unadjusted screen is a unit square, and we’re just stretching the square to fit the (rectangular) monitor. What am I missing?
Because it's a ratio. Like 1.5 to 1. The y is always multiplied by 1 and the x by 1.5. Because it's always a something to one we drop the 'to one' part.
Hi there. Did I make a mistake? So, for example, if we have a resolution of 800x600, the factor that we are looking for to multiply our x component is 0.75. That is h/w, no?
Maybe I'm wrong, but your eyes are not on top of the screen: your eyes are behind the screen, and there is some distance between you and the screen as well. What I believe is that we are trying to place the objects as if your eyes were on the screen rather than as if you were sitting on a chair. For example, in a first-person shooter the player is you, and what you see is as if you were standing in that world and not behind the screen. So zfar is measured as if your eyes were on the screen, not from your chair, and the objects are placed the same way; it's more realistic if your eyes are on the screen than if you are sitting on a chair looking at it.
Hi I’m stuck at this, could anyone help me find out where I am wrong please? Following the z normalize formula, let zNear=5, zFar=15, and Z be the depth of the vertex. I have: ([(15xZ)/(15-5)] - [(15x5)/(15-5)])/Z. If Z=10 I get result=0.75?? It should be 0.5 I think?? Does anyone know?
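For what it's worth, 0.75 looks like the right answer here, because the depth after the perspective divide is not linear in Z. A worked version of those numbers, assuming the 0-to-1 depth convention from the video (the subscripted names are just notation chosen for this note):

```latex
\[
\lambda = \frac{z_{far}}{z_{far}-z_{near}} = \frac{15}{15-5} = 1.5,
\qquad
z_{ndc} = \frac{\lambda Z - \lambda z_{near}}{Z}
        = \frac{1.5\cdot 10 - 1.5\cdot 5}{10} = 0.75
\]
\[
z_{ndc} = 0.5
\;\Longrightarrow\;
1.5\left(1 - \frac{5}{Z}\right) = 0.5
\;\Longrightarrow\;
Z = 7.5
\]
```

So the value 0.5 is reached at Z = 7.5, not at the midpoint Z = 10; depth values bunch up towards the far plane.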
I don't understand how just dividing by w you can normalize everything. If I have this vertex { -2.0, 0.0, 0.5 } and my projection parameters are fov = 60, aspect = 1.0, Znear = 0.1 and ZFar = 1000, I get w = 0.5. Projected X will be -3.46 and obviously, if you divide -3.46 by 0.5 it's the same as doubling it.
The meaning of normalize is lost when nobody knows what it means. "It means this, it means that". AI doesn't even know. Regardless, original z (perspective scaling factor) is swapped to w by this matrix, and z is altered so that the depth information is retained when z later gets divided by w. GPU vertex shaders automatically divide all components by w as a final unseen stage before passing the position to a fragment shader. Its function as inverse multiplication is to approach infinity as w gets closer to zero. Also, the most accurate floating-point numbers computationally are between -1.0 and 1.0. Feel free to take normalize out of vocabulary ⛔
You can pick them yourself. Some games use a FOV of 60°, others 50°, etc. Znear and Zfar the same thing. Some games have a znear of 100, others 1000. It's up to you, the programmer.
My understanding is that this z "normalization" will happen after the perspective divide, placing the z values between 0 and 1 (in front of us in a left-handed system). Or simply -1 and 1 in most APIs.
@@aprile1710 I was able to build an engine without this part. I use the following matrix below, then I perform perspective division. Works just fine without this step. (See link below)
// Perspective Projection Matrix
float persp[4][4] = {
    {aspect * 1/tan(fov/2), 0, 0, 0},
    {0, 1/tan(fov/2), 0, 0},
    {0, 0, 1, 0},
    {0, 0, -1, 0}
};
th-cam.com/video/IO9sT3t2fSc/w-d-xo.html&ab_channel=AlexFish
This was *exactly* what I was looking for! Most tutorials are either way too simplistic, are focused on using specific game engines, or are so incredibly advanced that you need a PhD level to understand them. Your tutorial hit that exact sweet spot in the middle, where you get complete, detailed info, but can still understand it without already having ten years of experience. This vid made me subscribe.
To those still scratching their heads after watching the end of this video and putting the perspective divide in your [vertex] shader, here is the conclusion he left out:
The output of the vertex shader is expected to be in Clip Space -- not NDC space (that is, with the projection matrix applied to the camera space vec4 but without perspective divide). Then, AFTER the vertex stage has run and the primitives have been clipped, the perspective divide is handled implicitly by the fixed-function pipeline, which divides x, y, and z by the w value stored in the output of the vertex stage before rasterization and the fragment shader.
Also, contrary to his statements, both modern OpenGL and DirectX assume Clip Space to be a left-handed coordinate system. Vulkan assumes Clip Space to be right handed (with the positive Y axis facing down). Rather, OpenGL's Clip Space expects the near and far plane to be mapped into the -1, 1 range whereas DirectX and Vulkan expect the near and far plane to be mapped into the 0, 1 range.
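For anyone writing a software renderer instead of a shader, here is a minimal CPU-side sketch of the divide that a GPU applies to the vertex stage's clip-space output. The type and function names are hypothetical, and it assumes clipping has already removed points with w at or near zero.

```c
/* Hypothetical types for illustration. */
typedef struct { double x, y, z, w; } vec4_t;
typedef struct { double x, y, z; } vec3_t;

/* Perspective divide: clip space -> normalized device coordinates (NDC).
   Assumes clip.w != 0, i.e. near-plane clipping has already happened. */
vec3_t clip_to_ndc(vec4_t clip) {
    vec3_t ndc = { clip.x / clip.w, clip.y / clip.w, clip.z / clip.w };
    return ndc;
}
```

In a GLSL or HLSL vertex shader you would not write this yourself; you output the undivided vec4 and let the hardware do the equivalent step.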
@@3rdGen-Media Thank you for the great addition.
That's the best video I have seen in the last months. I subscribed to your channel and liked the video. Keep it up, you did a great job!
I was super lucky to find this channel. After picking up some C/C++ and linear algebra basics, I will definitely proceed to a paid course about CG, game engines and raycasting to finally start my computer graphics journey after 20 years of procrastination :)
How did it go eventually?
@@hangyboi i am in progress :) not super focused last half of the year but now i am more than sure that these are brilliant courses.
@@andreypopov6166 Good luck to you, and never give up!
This is the best video I've seen explaining how this perspective projection matrix works. Thank you so much! 😊
That feeling when you find that one video that answers all of your questions👍👌👌👌 thanks for the amazing explanation
Is this a joke, I just started learning rendering stuff and you just posted about the subject I was searching for. Your channel is a gem keep up the good work and thank you soo much for this quality content!
@21:43 are you using the Math3D library by Stephan Soller?
No. I wasn't aware of this library before your comment. This is a snippet of the code we wrote in our 3D programming module.
I assume the code might look similar given that all matrix entries are similar anyway.
All I can say after watching this video is awesome. You are the best by far compared to other videos on this topic.
Thank you for the great video! By the way, congratulations for completing the new physics tutorial. Another one that I really want to do!
I can imagine that the last few weeks have been tough. It is often like that on the last mile of a project, right?
Do you already have an idea of what topic you will do next?
Took me weeks of learning this in school and it's summarized into one short video. Great job
I honestly love these explanations.
Maybe i am a mongrel, but not using fancy words before explaining them works for me. I dont know why people cant do that normally 😊
Thank you so much my man!!!! The first video to explain it in an easy to understand way.
Excellent explanation! Very well done and paced. I learned a lot!
What a fluid explanation! Thank you for these lectures.
Your channel is amazing, thank you very much!!
You deserve 1 trillion views.
Very nice and detailed explanation. Thank you very much!
Wonderfully explained. Clear and thorough
14:27 In this part you keep switching between whether the range is 0 to 1 or the range -1 to 1
Hi. I think I meant "0 to 1" for values that are in front of us.
@@pikuma Hello. Thanks for answering back. I'm having a bit of an issue that I can't really solve though. Normally the camera in view space is oriented to face towards the -Z direction. I have a near clipping plane of 1 and a far clipping plane of 100. This would mean that all the points that should fit inside the clip box should have a z value between -1 and -100 since I'm facing the -Z direction. For some reason tho it's reversed so only the values between 1 and 100 are inside the clipping box (have a final z value between 0 and 1).
I just figured out how to fix this while typing the comment but I'm going to post it anyway because it might help someone in the future. I fixed it by using negative values for my near and far z planes. So instead of Znear being 1 it's equal to -1 and likewise Zfar is equal to -100 instead of 100. I have no idea whether it's normal to use negative values for clipping planes but that's what I've done to fix my issue.
Edit: DON'T DO THIS. The x and y components are going to flip sign (negative becomes positive and vice versa)
great explanation. I was learning shader stuff and it helped a lot
I think you dropped this man 👑
I'm already saving for the 2D Physics game course :)
Just saying, I'm enjoying the Raycasting course while at it
No rush. 🙂 Enjoy every minute of it!
It seems the calculation you have done for the perspective projection matrix assumes Z from 0 to 1 (to get Zf/(Zf-Zn) and -ZfZn/(Zf-Zn)) and not -1 to 1. For -1 to 1 the values would calculate to be (Zf+Zn)/(Zf-Zn) and -2ZfZn/(Zf-Zn).
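Both pairs of values can be checked numerically. The sketch below assumes the video's setup, where the camera looks down +z and w ends up holding the original z; OpenGL's own matrix looks down -z, so its signs differ. The struct and function names are hypothetical.

```c
/* Scale and offset of the projection matrix's z row (hypothetical struct). */
typedef struct { double scale, offset; } depth_coeffs_t;

/* Maps znear -> 0 and zfar -> 1 after the divide by w. */
depth_coeffs_t depth_coeffs_zero_to_one(double zn, double zf) {
    depth_coeffs_t c = { zf / (zf - zn), -(zf * zn) / (zf - zn) };
    return c;
}

/* Maps znear -> -1 and zfar -> 1 after the divide by w. */
depth_coeffs_t depth_coeffs_minus_one_to_one(double zn, double zf) {
    depth_coeffs_t c = { (zf + zn) / (zf - zn), -(2.0 * zf * zn) / (zf - zn) };
    return c;
}

/* Depth after the perspective divide, with w equal to the original z. */
double ndc_depth(depth_coeffs_t c, double z) {
    return (c.scale * z + c.offset) / z;
}
```

Plugging in z = znear and z = zfar gives 0 and 1 for the first pair and -1 and 1 for the second, which matches the ranges in the comment.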
this video is just something else, it explains so well a really complex thing.
this is the best video for beginner
I wonder how I could apply this with regards to VR and canted or non-parallel displays, as those view matrices are completely different both from a flat screen and each eye. Like for instance, in the left eye m20 is the inverse of m02 and m00 is the same value as m22. This is extremely hard to figure out
Hm, are you sure it's not simply a different handedness or system (like OpenGL) that achieves something similar but using different matrix entries?
Can you kindly point out the resource you're using? 🙂
An amazing explanation!
thanks so much, this was very useful and handy for me
@Pikuma Please correct me if I'm wrong, but if you define aspect as height / width, then you don't need to invert the fov formula. I.e. when aspect = h / w, then fov = tan(angle / 2). Inversion is needed if aspect = width / height.
Hi @AmarelSWTOR. That's a good question. I thought the inversion was needed to correctly set the FOV angle as "inversely proportional" to how we scale the screen x and y. The FOV I'm using in the code is the vertical FOV (= h/w).
Is this true?
@@Dannnneh it's true
19:40 I'm not sure why but I don't understand the yellow part with the minus sign. Is it multiplied by w as well or is it just incremented without multiplication?
(the negative value to the z when doing x*0+y*0+z*(far/(dist))+(-(far/(dist))) or is it x*0+y*0+z*(far/(dist))+W*(-(far/(dist)))?)
Good question. So, the element [3,3] of the matrix multiplies Z, and the element [3,4] of the matrix multiplies by 1 but subtracts from the previous value of [3,3]*Z.
It is:
(zfar/(zfar-znear)) * Z - (znear*(zfar/(zfar-znear))) * 1
All this will be stored in the final vector Z component.
@@pikuma Oh thanks for explanation. I'm learning this because I'm trying to render things without OpenGL or anything.
I'm still having problems with transforming Vector in 3D space to normalized Vector of the screen.
@@SkrovnoCZ Hm, I see. I believe OpenGL uses a different perspective projection matrix than the one I mentioned. All the tasks are still basically the same, but the handedness and the final normalization are a bit different.
For a breakdown of the OpenGL "way of doing projection", this website is great:
www.songho.ca/opengl/gl_projectionmatrix.html
@@pikuma Thanks. I'm not using OpenGL because I don't understand it. I'm just doing printf() of a prepared string which will output 3D shapes.
You are explaining it great btw.
Do you also have a video about "Clipping" in image space? (when a triangle is on the borders of the image space then 3 points of that triangle become 4 which will result in cutting the remaining part out of bounds so the triangle will become a rectangle?)
@@SkrovnoCZ Sure. All the pipeline is covered in our lectures at pikuma.com, including clipping. Although we do frustum clipping in world space in our code.
Fuck yesss, I have yet to watch but I'm glad to see this!
Please correct me if I am wrong: the normalization of z to between 0 and 1, considering that what we are seeing is between znear and zfar, should be z' = (z-znear)/(zfar-znear), shouldn't it?
Thank you, this is inspiring
I understand this, however i still don't get how you pass from 3d coordinates to 2d, since the screen is actually a 2d raster, once you have the 3d coordinates, how do you represent the points in 2d? i'm stuck with this. Great videos btw, by far the best ones i've seen about computer graphics.
After the perspective-divide (where we divide both x's and y's by z), we simply plot a pixel at point (x,y) on the screen. It's almost like we forget there was a z, and we draw the x & y on the 2D screen.
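One step usually sits between the divide and the plotted pixel: the normalized x and y are scaled to the window size. A minimal sketch, assuming x and y are in -1..1 with y pointing up and a screen whose origin is the top-left corner; `vec2_t` and `ndc_to_screen` are hypothetical names.

```c
typedef struct { double x, y; } vec2_t;

/* Map normalized coordinates in [-1, 1] to pixel coordinates.
   Assumes a top-left screen origin, so y is flipped. */
vec2_t ndc_to_screen(double ndc_x, double ndc_y, int width, int height) {
    vec2_t s;
    s.x = (ndc_x + 1.0) * 0.5 * width;
    s.y = (1.0 - ndc_y) * 0.5 * height;  /* screen y grows downward */
    return s;
}
```

With an 800x600 window, the point (0, 0) lands at pixel (400, 300), the center of the screen.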
@@pikuma Thanks for the response! this actually clarified a lot for me.
Thank you!
Is there any logic/intuition behind the lambda expression? I'm looking everywhere for derivation of that, but can't find any info.
Did you ever figure it out? I need it too lol
you find it?
@@onetimeplace198 this video helped a lot by explaining what the overall goal was of the entire projection. th-cam.com/video/U0_ONQQ5ZNM/w-d-xo.html
Great video, thank you so much
Hi Sir, I have one question: is the extra fourth column and row added to our matrix there for translation or for the perspective view? Or are there two separate matrices for projection and translation? Can you please give me more information about this? And thank you so much for the lecture; I found your content the best among all the available content related to graphics.
The fourth column is cheating; it’s a mathy way to add a value within a chain of matrix multiplications, and this is known as homogeneous matrices. If you ask why on earth we need to add something within a chain of matrix multiplications, well, this has to do with affine transformations (rotation and scale matrices combine by multiplication because they are linear transformations, BUT translation cannot be expressed as a 3x3 multiplication because it is not linear, so it has to be added to them as an addition).
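A small sketch of that trick in C: with a fourth component w = 1, the translation amounts sit in the fourth column and get added by an ordinary matrix multiplication. Names are hypothetical, and the column-vector layout is an assumption.

```c
typedef struct { double m[4][4]; } mat4_t;
typedef struct { double x, y, z, w; } vec4_t;

/* 4x4 translation matrix: identity plus (tx, ty, tz) in the fourth column. */
mat4_t make_translation(double tx, double ty, double tz) {
    mat4_t t = {{
        { 1.0, 0.0, 0.0, tx },
        { 0.0, 1.0, 0.0, ty },
        { 0.0, 0.0, 1.0, tz },
        { 0.0, 0.0, 0.0, 1.0 }
    }};
    return t;
}

/* Matrix times column vector. */
vec4_t mat4_mul_vec4(mat4_t m, vec4_t v) {
    vec4_t r;
    r.x = m.m[0][0] * v.x + m.m[0][1] * v.y + m.m[0][2] * v.z + m.m[0][3] * v.w;
    r.y = m.m[1][0] * v.x + m.m[1][1] * v.y + m.m[1][2] * v.z + m.m[1][3] * v.w;
    r.z = m.m[2][0] * v.x + m.m[2][1] * v.y + m.m[2][2] * v.z + m.m[2][3] * v.w;
    r.w = m.m[3][0] * v.x + m.m[3][1] * v.y + m.m[3][2] * v.z + m.m[3][3] * v.w;
    return r;
}
```

A point (1, 2, 3, 1) translated by (10, 20, 30) becomes (11, 22, 33, 1), while a direction with w = 0 is left unchanged. The same fourth row is what the projection matrix uses to copy z into w, so one 4x4 format covers translation and perspective.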
15:36 is what I don't get. How is λ derived? I've spent 3 days over this and still don't get it. Everything else is pretty easy
can relate lol
Did you ever figure it out lol (I can't figure it out either)
@@laskdjf3880 i ended up just copying math (or even using math library). Its easy to encapsulate and forget about it
I need to understand some things:
1 - after the projection, can I use the X and Y normally without Z?
2 - looking at the matrix projection function and its parameters, the 'aspect' is the H/W... understood... but what are the best values for 'fov', ZNear and ZFar?
Hi Joaquim. After the projection we usually render points (x,y) on the screen.
That works ok, but in reality GPUs store the old z (depth) value with the projected point because it is useful for certain computations later (like texture mapping our polygons on the screen considering perspective). We need the z (actually 1/z) for that!
Znear and Zfar you can manually choose for your game as you want. Some games use a very big Zfar, while others use a very small Zfar (and we can see less objects at the distance).
Back in the day of slow machines we used small Zfar values (clipping objects to improve CPU performance). Sometimes we even added a *fog* effect to mask that visible/aggressive far clipping.
@@pikuma thanks so much for everything. Do you have a video that uses a projection and dots/lines? I need to learn more things ;)
@@joaquimjesus6134 Sure. I have the lectures on 3D graphics at pikuma.com that cover a complete software renderer.
@@pikuma thank you so much. One more thing: what is the best 'fov'?
@@joaquimjesus6134 Same thing, some games like 60 degrees, other games use 90 degrees... it depends on what's the angle of opening you want and how many objects you want to see inside your 'field of view' in your game.
There's no correct answer. 🙂
Hugs!
I still don't understand why do we need zfar and znear.. what would happen if we didn't use them?
Hi krystof. They define what is visible in terms of depth. What is the closest and the furthest z value we will consider for the projection onto the screen? Everything outside znear and zfar we won't consider.
Another purpose of the near and far planes is clipping (see my video about the stages of the graphics pipeline), so we don't try to project vertices that are too close to the eye-point. If we tried to project a vertex with z = 0, the perspective divide (division by z) would divide by zero, and that would be a problem.
@pikuma looking at your equation, if z = znear, it looks like the outputted z value is going to be 0. From your explanation of NDC, shouldn't it equal -znear?
This is really well explained. do you go into more depth in your course "3D computer graphics?"
Sure. We cover a simplified (but complete) pipeline.
Could anyone explain where he got the scaling factors for normalizing z from? Totally lost me when it got to zfar/(zfar-znear) and -(zfar*znear)/(zfar-znear).
Great video, it's just that I would like to actually understand where the math comes from and not just copy and paste formulas.
It's essentially a unit conversion. The first term is a scaling factor that will squeeze the numbers between Znear and Zfar to the space between 0 and 1 with an offset. It's offset because the 0's don't line up in the two unit systems. Since Znear will become 0 after normalization we subtract it (after converting it with the scaling factor) from the scaled z. This will shift all the values to between 0 and 1, no offset.
An alternative method would be to subtract Znear from z (line up the 0's). Then divide it by Zfar - Znear (scale between 0 and 1, no offset). This method breaks the matrix step though.
@@deroll_sweet What do you mean exactly by it's offset 'because the 0s don't line up with the two unit systems'? I'm trying to figure out how to derive this scale factor but I can't :(
How the fuck did i understand this?@@deroll_sweet
Thanks
great work and great explanation! thank you very much!
I have a question: how can I determine the angle of a normal inside the projection matrix? I mean, how can I determine if a side of a cube is currently visible to the projection matrix?
Glad it was helpful! 🙂
Thank you so much!
1 / tan is cotan, isn't it? Divisions are evil in fast computations.
@@alexfrozen Yes, that's correct. 🙂
Why would you divide `result.z / result.w` at the very end? What is the point of "perspective shrinking" the distance factor? Seems an unnecessary step, particularly if z was already normalized.
The normalization of the z values (value between 0 and 1) happens *after* the perspective divide.
I understood a lot but am ultimately confused. What do we do when the camera is not pointing towards z? How does it work if the camera is pointing towards an object with z = 0?
Even an ability to discern patterns lends little here, in the generalized nature of 3D matrix transformations, perspective and projection. We could just exchange the x and z values thereby looking down the X-Axis. Really though, all has to be learned and understood independently. Less we are subject to hack-bloating, more we are able to design transformation matrices in specific ways for particular computations.
The camera itself is a component which is described by a transformation matrix. The order which transformations are applied is very important to boot. Where there is a stage, a camera and an object, there are transformation matrices to describe them. Each to have their own designated scale, rotation and translation. A final transformation matrix is produced by ordered multiplication. Correct order is usually projection * stage * camera * object * vertex, applying first the object scale-rotation-offset matrix to the vertex. Camera zoom-orbit-position followed by stage augmentation-angle-locale, all effected before perspective projection 🖼
The matrix to make the camera look down the X-Axis would contain orbit (rotation) parameters in cells camera[1].yz and camera[2].yz. Per Google AI, "A rotation matrix to look down the x-axis in 3D is: [1 0 0; 0 cos(θ) -sin(θ); 0 sin(θ) cos(θ)]". Interpretation to viably incorporate takes a lot of discernment, which might come from taking to 2D. Where there is no 'camera', it is invisibly defined as the identity matrix.
Why do we multiply x by the aspect ratio but not y?
this is exactly what i wonder
Because it is a ratio between the x and y axes, specifically the width and the height, which means one of them is the base (always 1.0, or 100%) and the other is expressed as a percentage of that base. For instance, a screen 500 pixels high and 1000 pixels wide has an aspect ratio (h/w) of 0.5:1. The reason you only multiply x and not y is, once again, that it is a ratio: if you multiplied both by the same factor, the proportions would not change, and that is not what we want. Assume the object originally comes from a space that is square, so values are distributed equally along the x-axis and the y-axis. That is not the case in screen space, because its width and height are not equal. For example, take a square with one of its vertices at (0.5x, 0.5y). What happens if it is converted to screen space without multiplying by the aspect ratio? 0.5y = 50% of the height and 0.5x = 50% of the width, and 50% of 500 pixels and 50% of 1000 pixels are clearly not the same. However, if you now multiply x by 0.5 (the aspect ratio we just calculated), 0.5 x 0.5 = 0.25, and 25% of 1000 pixels is indeed equal to 50% of 500 pixels, so the square is now rendered correctly in screen space.
Sounds smart
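The aspect-ratio argument above can be checked with a few lines of C. This is only a sketch: the function names are hypothetical, the aspect ratio is taken as height / width as in the example, and normalized coordinates are assumed to run from -1 to 1 with 0 at the screen center.

```c
#include <assert.h>

/* Scale x by the aspect ratio (height / width), leaving y alone. */
float apply_aspect(float x, int screen_width, int screen_height) {
    assert(screen_width > 0 && screen_height > 0);
    float aspect = (float)screen_height / (float)screen_width;
    return x * aspect;
}

/* Distance in pixels from the screen center for a normalized coordinate
 * in [-1, 1], where 1 maps to half of the given screen dimension. */
float ndc_to_pixels_from_center(float ndc, int screen_dimension) {
    return ndc * (float)screen_dimension / 2.0f;
}
```

On a 1000x500 screen, the corner (0.5, 0.5) without the correction sits 250 pixels right of center but only 125 pixels above it, so the square looks stretched. After `apply_aspect` x becomes 0.25, which is 125 pixels on both axes. For 800x600 the same helper gives a factor of 0.75.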
I need to understand something. Does the projection matrix receive the vertices of the world objects already normalized, or does the matrix take care of normalizing them into the range from -1 to 1?
The projection matrix receives the values as they are in world space (not normalized), and the normalization of x's, y's, and z's (between -1 and 1) happens as we multiply by the projection matrix and also after the perspective divide (which performs the division by w).
@@pikuma When you say world values, you mean values that can be outside the range -1 to 1? For example, I can put an arbitrary value for a vertex, maybe (5.0, 2.0, 3.0), and then the matrix will take care of normalizing it so that it is within that range (-1 to 1).
@@stevenriofrio7963 Yes, world space is basically any value in the 3D world... (0,0,-3.6), or (-4.5, 5.8, 47.0), etc.
@@pikuma The last question. If I provide one of my vertices with a z coordinate of (60.0) and my "zfar" is 20.0, will it not be seen on the screen? . Thank you very much for responding.
@@stevenriofrio7963 There's a little more to it, and it involves something called clipping. There is a stage where we clip all the triangles to only have objects inside the view frustum. The clipping happens at the top, bottom, left, right, and also the near and far planes. That's why vertices outside znear and zfar get discarded (clipped out of our final view) and we only render objects inside the view (between -1 and 1).
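To see what happens to that z = 60 vertex, here is a rough sketch in C of the inside/outside test that clipping is based on. Everything here is an assumption for illustration: the names are hypothetical, `sx` and `sy` stand in for the aspect and FOV scale factors, the matrix layout is the one discussed in this thread (z mapped to the 0..1 range, w = z), and znear = 1 is picked arbitrarily since the question does not give one. A real renderer clips whole triangles against the planes rather than just testing single points.

```c
#include <assert.h>
#include <stdbool.h>

typedef struct { float x, y, z, w; } vec4_t;

/* Clip-space position of a camera-space point:
 * z_clip = lambda * z - lambda * znear, and w keeps the original z. */
vec4_t project_to_clip(float x, float y, float z,
                       float sx, float sy, float znear, float zfar) {
    assert(znear > 0.0f && zfar > znear);
    float lambda = zfar / (zfar - znear);
    vec4_t clip = { sx * x, sy * y, lambda * z - lambda * znear, z };
    return clip;
}

/* Inside the frustum means: before the divide, x and y lie in [-w, w]
 * and z lies in [0, w]. After dividing by w that is [-1, 1] and [0, 1]. */
bool is_inside_frustum(vec4_t c) {
    return c.w > 0.0f
        && -c.w <= c.x && c.x <= c.w
        && -c.w <= c.y && c.y <= c.w
        && 0.0f <= c.z && c.z <= c.w;
}
```

With zfar = 20, a point at z = 60 fails the test (its z after the divide would be greater than 1), a point at z = 10 passes, and a point at z = 0.5 in front of the near plane fails as well.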
dude, your brazilian is pretty strong, i can tell it by the tone
I'm glad. 🙂
I have watched this video a few times and I will watch it again. There's one thing that you don't mention: do I need to convert degrees to radians? Does the computer use radians instead of degrees?
Most graphics frameworks expect values to be in radians. Degrees are only used to display or input angles from the user via UI. In programming, it's usually all done in radians.
@@pikuma thank you so much for all
this is awesome
Eyes on the prize! This guy is a national pride haushs
I was wondering, isn't the following matrix correct?
projectionMatrix = [
[aspectRatio * FOV, 0, 0, 0],
[0, FOV, 0, 0],
[0, 0, lambda, 1 ],
[0, 0, -lambdaOffset, 0],
]
since we have to subtract the lambdaOffset for the Z component, wouldnt it be better if it was in the 3rd column? (, '-')a
I just spotted the difference!
i was doing the vector[1x4] . [4x4]matrix
you're doing the matrix[4x4] .[4x1] vector
but unfortunately, my rendering is still all messed up
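The row-vector versus column-vector difference can be sketched in C like this. The types and function names are hypothetical, and the numbers in the usage note are placeholders rather than real projection values. The idea is that multiplying a row vector by a matrix gives the same result as multiplying the transposed matrix by a column vector, so the matrix in the comment above is the transpose of the column-vector layout.

```c
#include <assert.h>
#include <math.h>

typedef struct { float x, y, z, w; } vec4_t;
typedef struct { float m[4][4]; } mat4_t;

/* Column-vector convention: result = M * v (4x4 times 4x1). */
vec4_t mat4_mul_vec4(mat4_t m, vec4_t v) {
    vec4_t r;
    r.x = m.m[0][0] * v.x + m.m[0][1] * v.y + m.m[0][2] * v.z + m.m[0][3] * v.w;
    r.y = m.m[1][0] * v.x + m.m[1][1] * v.y + m.m[1][2] * v.z + m.m[1][3] * v.w;
    r.z = m.m[2][0] * v.x + m.m[2][1] * v.y + m.m[2][2] * v.z + m.m[2][3] * v.w;
    r.w = m.m[3][0] * v.x + m.m[3][1] * v.y + m.m[3][2] * v.z + m.m[3][3] * v.w;
    return r;
}

/* Row-vector convention: result = v * M (1x4 times 4x4). */
vec4_t vec4_mul_mat4(vec4_t v, mat4_t m) {
    vec4_t r;
    r.x = v.x * m.m[0][0] + v.y * m.m[1][0] + v.z * m.m[2][0] + v.w * m.m[3][0];
    r.y = v.x * m.m[0][1] + v.y * m.m[1][1] + v.z * m.m[2][1] + v.w * m.m[3][1];
    r.z = v.x * m.m[0][2] + v.y * m.m[1][2] + v.z * m.m[2][2] + v.w * m.m[3][2];
    r.w = v.x * m.m[0][3] + v.y * m.m[1][3] + v.z * m.m[2][3] + v.w * m.m[3][3];
    return r;
}

/* Swap rows and columns. */
mat4_t mat4_transpose(mat4_t m) {
    mat4_t t;
    for (int i = 0; i < 4; i++) {
        for (int j = 0; j < 4; j++) {
            t.m[i][j] = m.m[j][i];
        }
    }
    return t;
}

/* Sanity check: M * v must match v * transpose(M). */
void check_conventions_agree(mat4_t m, vec4_t v) {
    vec4_t a = mat4_mul_vec4(m, v);
    vec4_t b = vec4_mul_mat4(v, mat4_transpose(m));
    const float eps = 1e-5f;
    assert(fabsf(a.x - b.x) < eps && fabsf(a.y - b.y) < eps);
    assert(fabsf(a.z - b.z) < eps && fabsf(a.w - b.w) < eps);
}
```

For example, with placeholder values lambda = 2 and lambdaOffset = 4, the column layout has rows `{0, 0, 2, -4}` and `{0, 0, 1, 0}`, the row layout has rows `{0, 0, 2, 1}` and `{0, 0, -4, 0}`, and both send (1, 2, 3, 1) to z = 2 and w = 3. If a rendering is still broken after switching convention, a common cause is mixing the two, for instance building matrices in one layout and multiplying in the other.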
I’m embarrassed, but why do we multiply the x component by the whole aspect ratio, instead of multiplying x by screen width and y by screen height? The unadjusted screen is a unit square, and we’re just stretching the square to fit the (rectangular) monitor. What am I missing?
Because it's a ratio. Like 1.5 to 1. The y is always multiplied by 1 and the x by 1.5. Because it's always something-to-one, we drop the 'to one' part.
@@neoncyber2001 so like y = 1 and x = 1(x/y). Where y is 1?
pretty sure aspect ratio is w/h. or am i mistaken?
Hi there. Did I make a mistake? So, for example, if we have a resolution of 800x600, the factor that we are looking for to multiply our x component is 0.75. That is h/w, no?
good point. pikuma's response helped me to realize it's fx/a for standard aspect. afx for inverse aspect straightens it out.
Awesome ♥♥
I don't understand anything about normalizing z.
Maybe I'm wrong, but your eyes are not on the surface of the screen. Your eyes are behind the screen, and there is some distance between you and the screen as well. What I believe is that we are trying to place the objects as if your eyes were on the screen, rather than where you are sitting in your chair. For example, in a first-person shooter the player is you, and what you see is as if you were inside that world, not behind the screen. So zfar is measured as if your eyes were on the screen, not from your chair, and objects are placed the same way. It's more realistic if your eyes are on the screen rather than you sitting in a chair.
Great Tutorial I understood a lot but still my tiny brain can't handle some things.
Hi, I'm stuck on this. Could anyone help me find out where I'm going wrong, please?

Following the z normalization formula, let zNear = 5, zFar = 15, and Z be the depth of the vertex. I have:

( [(15 x Z)/(15-5)] - [(15 x 5)/(15-5)] ) / Z

If Z = 10 I get 0.75?? It should be 0.5, I think?? Does anyone know?
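For what it's worth, 0.75 is what that formula gives. The mapping is not linear in z because of the final divide by z, so the midpoint between zNear and zFar does not land on 0.5. Here is a small sketch in C that compares the two; the function names are hypothetical, and it assumes the layout discussed in this thread (λ = zFar / (zFar - zNear), offset λ·zNear, divide by the original z).

```c
#include <assert.h>

/* Depth after the projection matrix and the perspective divide:
 * (lambda * z - lambda * znear) / z, which is 0 at znear and 1 at zfar. */
float depth_after_divide(float z, float znear, float zfar) {
    assert(znear > 0.0f && zfar > znear && z > 0.0f);
    float lambda = zfar / (zfar - znear);
    return (lambda * z - lambda * znear) / z;
}

/* Plain linear remap, shown only for comparison.
 * This is NOT what the matrix plus divide produces. */
float depth_linear(float z, float znear, float zfar) {
    assert(zfar > znear);
    return (z - znear) / (zfar - znear);
}
```

With zNear = 5 and zFar = 15, `depth_after_divide` gives 0 at Z = 5, 0.75 at Z = 10 and 1 at Z = 15, while `depth_linear` gives 0.5 at Z = 10. Both agree at the two ends, and only the values in between differ.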
After the multiplication, why don't we divide by zFar instead of by Z?
I don't understand lambda much,why don't we divide zfar÷znear = ratio? But zfar÷(zfar-znear)????
I don't understand how just dividing by w you can normalize everything.
If I have this vertex { -2.0, 0.0, 0.5 } and my projection parameters are fov = 60, aspect = 1.0, Znear = 0.1 and ZFar = 1000, I get w = 0.5. Projected X will be -3.46 and obviously, if you divide -3.46 by 0.5 it's the same as doubling it.
The meaning of normalize is lost when nobody knows what it means. "It means this, it means that". AI doesn't even know. Regardless, original z (perspective scaling factor) is swapped to w by this matrix, and z is altered so that the depth information is retained when z later gets divided by w.
The GPU automatically divides all components by w as a final unseen stage after the vertex shader, before the position reaches the fragment shader. As an inverse multiplication, the result approaches infinity as w gets closer to zero. Also, the most accurate floating-point numbers computationally are between -1.0 and 1.0. Feel free to take normalize out of vocabulary ⛔
What are the values of fov, znear and far? How can I get them?
You can pick them yourself. Some games use a FOV of 60°, others 50°, etc.
Znear and Zfar are the same thing. Some games have a zfar of 100, others 1000.
It's up to you, the programmer.
@@pikuma okk got you! Thank u!
I get it now. The bigger the field of view, the bigger the objects 😂
😱 Oh no! Hahaha.
zFar / (zFar - zNear) will never be between 0 and 1... Think about it: 10/(10-1) or 100/(100-20).
My understanding is that this z "normalization" will happen after the perspective divide, placing the z values between 0 and 1 (in front of us in a left-handed system). Or simply -1 and 1 in most APIs.
@@aprile1710 I was able to build an engine without this part. I use the following matrix below, then I perform perspective division. Works just fine without this step. (See link below)
// Perspective Projection Matrix
float persp[4][4] = {
{aspect * 1/tan(fov/2), 0, 0, 0},
{0, 1/tan(fov/2), 0, 0},
{0, 0, 1, 0},
{0, 0, -1, 0}
};
th-cam.com/video/IO9sT3t2fSc/w-d-xo.html&ab_channel=AlexFish
:D
its a shame he ruined the video by flashing his satan hands all the way through it.
...Excuse me?
Brilliant. Thanks