Awesome, great video!
Thanks!
You hero among men.
LOL! Thanks!
Keep up the hard work, chief
Sure thing Boss ;-)
Very good info and explanation! thanks
You're welcome :-)
You're the best !!!!! thanks
Thank you so much!
Awesome videos!
Thanks!
You are awesome! I am very grateful for your work! I was wondering, is it worth doing batching in a 3D renderer? Or maybe it will turn out to be a useless feature (especially with different assimp optimizations)... It seems to me that this would be useful in a game with low-poly like graphics, but certainly not in a physically based renderer or something like that, am I right?
Thanks!
Batching in 3D is usually beneficial because the GPU is (depending on your $$$) very fast and you don't want it to sit idle. You need to always feed it with work so you want to minimize the CPU-to-GPU interaction and batching is one way to do it.
Thanks a lot. Btw, have you ever considered making a video about OIT?
Yes, but will require some research because I'm not very familiar with the topic. Will add it to the todo list.
Can't you create a struct containing the relevant quad info (base pos etc.) and then create an array of structs with size MAX_QUADS?
struct T {
vec2 BasePos;
vec2 WidthHeight;
...
};
uniform T QuadInfo[MAX_QUADS];
I haven't tried that but it should work. We are getting the offset of each element in T using glGetActiveUniformsiv with GL_UNIFORM_OFFSET (glGetActiveUniformBlockiv only gives you the block size and the indices of the uniforms in it). If you do this my way (structure of arrays) this gives you the offset to the corresponding array within T, so all you need to do is access the specific element using the offset as a base pointer: T.BasePos[i]. If you do this your way (array of structures) then you should get the offset inside a single struct. You then need to access the specific T element and use the offset to find the attribute inside it. The assumption is that the size of T is the sum of its attributes and that everything is packed together. It makes sense but I haven't studied the spec enough to verify it.
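For anyone who wants to check the packing assumption on paper: here is a minimal sketch, assuming the array of structs lives inside a uniform block declared with layout(std140) and that the struct only contains scalars and vectors. Under std140 a struct is aligned and padded to a multiple of 16 bytes, so the size of T is not always the plain sum of its members. All names here (Std140Member, ComputeStd140Layout, MemberOffset) are made up for illustration; for any other layout the offsets should be queried from the driver instead.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Base alignment and size (in bytes) of one struct member under std140.
struct Std140Member {
    std::size_t alignment;
    std::size_t size;
};

// Scalar and vector members only (assumption: no matrices or nested arrays).
constexpr Std140Member kFloat{4, 4};
constexpr Std140Member kVec2{8, 8};
constexpr Std140Member kVec3{16, 12};
constexpr Std140Member kVec4{16, 16};

struct Std140StructLayout {
    std::vector<std::size_t> memberOffsets;  // offset of each member inside one struct
    std::size_t stride;                      // distance between consecutive array elements
};

inline std::size_t RoundUp(std::size_t value, std::size_t alignment) {
    return (value + alignment - 1) / alignment * alignment;
}

inline Std140StructLayout ComputeStd140Layout(const std::vector<Std140Member>& members) {
    Std140StructLayout layout{};
    std::size_t cursor = 0;
    for (const Std140Member& m : members) {
        cursor = RoundUp(cursor, m.alignment);  // each member starts at its own alignment
        layout.memberOffsets.push_back(cursor);
        cursor += m.size;
    }
    // std140 rounds the struct alignment up to that of a vec4 (16 bytes),
    // and the struct size is padded to a multiple of that alignment.
    layout.stride = RoundUp(cursor, 16);
    return layout;
}

// Byte offset of member `member` of array element `element` from the start of the array.
inline std::size_t MemberOffset(const Std140StructLayout& layout,
                                std::size_t element, std::size_t member) {
    assert(member < layout.memberOffsets.size());
    return element * layout.stride + layout.memberOffsets[member];
}
```

With only the two vec2 members shown above the struct happens to be tightly packed (16 bytes per element). A vec2 followed by a single float would already be padded from 12 to 16 bytes.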
Awesome.
Thanks :-)
Slightly unrelated, instead of using 6 vertices to draw the quad you could also use only 4 vertices with an element buffer object right? Does this make performance better?
Sure, but I doubt whether it will perform better due to the low number of vertices. So I usually skip the element buffer in such cases. btw, you can achieve 4 vertices without an element buffer using GL_TRIANGLE_STRIP.
I'm still trying to figure out the vertex buffer, index buffer, and the VAO thing. How do I draw multiple different models? Do I have to declare a different vertex and index buffer for each object or do I declare all models in one vertex buffer and one index buffer? And the VAO: what the f is this? I'm still confused...
You can manage your models in different ways and you basically mentioned the two main strategies. Option #1 is to use a dedicated VB/IB for each model. This seems more straightforward and I guess many people will go with that option because it makes sense to wrap the model in a C++ class and keep the buffers as private attributes. Option #2 is to load all the vertices and indices of all the models into a single VB or VB/IB pair and draw a specific model using glDrawElementsBaseVertex or any similar draw call that allows you to draw from the middle of the buffer. Whatever works better for you. The idea behind the VAO is to minimize the number of state changes that you need to take care of before a draw call - you need to enable all the active vertex attributes and set the layout for each one. You can set up all this state once inside the VAO so that every time you bind the VAO you don't need to worry about all these state changes.
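To make option #2 a bit more concrete, here is a rough CPU-side sketch of the bookkeeping, assuming position-only vertices (3 floats each) and 32-bit indices. Mesh, MeshRange, PackedMeshes and PackMeshes are hypothetical names for this example, not something from the tutorial code. The GL upload and the draw call are only shown as a comment.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CPU-side mesh: positions only, to keep the sketch short.
struct Mesh {
    std::vector<float> vertices;        // 3 floats per vertex (assumption)
    std::vector<unsigned int> indices;  // relative to this mesh's own first vertex
};

// Where one model ended up inside the shared buffers.
struct MeshRange {
    int baseVertex;          // added to every index by the draw call
    std::size_t firstIndex;  // position of the model's first index in the shared IB
    int indexCount;
};

struct PackedMeshes {
    std::vector<float> vertices;
    std::vector<unsigned int> indices;
    std::vector<MeshRange> ranges;
};

inline PackedMeshes PackMeshes(const std::vector<Mesh>& meshes) {
    constexpr std::size_t kFloatsPerVertex = 3;
    PackedMeshes packed;
    for (const Mesh& mesh : meshes) {
        assert(mesh.vertices.size() % kFloatsPerVertex == 0);
        MeshRange range;
        range.baseVertex = static_cast<int>(packed.vertices.size() / kFloatsPerVertex);
        range.firstIndex = packed.indices.size();
        range.indexCount = static_cast<int>(mesh.indices.size());
        packed.ranges.push_back(range);
        // Indices are copied unchanged; baseVertex shifts them at draw time.
        packed.vertices.insert(packed.vertices.end(), mesh.vertices.begin(), mesh.vertices.end());
        packed.indices.insert(packed.indices.end(), mesh.indices.begin(), mesh.indices.end());
    }
    return packed;
}

// With the packed data uploaded to one VB/IB pair, model i could be drawn with:
//   const MeshRange& r = packed.ranges[i];
//   glDrawElementsBaseVertex(GL_TRIANGLES, r.indexCount, GL_UNSIGNED_INT,
//                            (void*)(sizeof(unsigned int) * r.firstIndex), r.baseVertex);
```

The nice part of the base vertex is that each model keeps its original zero-based indices, so nothing has to be rewritten when the models are concatenated.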
@@OGLDEV Thank you!
Did you write another comment? Seems like youtube deleted it.
@@OGLDEV YouTube sadly deletes lots of comments
Very true unfortunately.
many thanks!
You're welcome!
How would you handle rotation of each individual sprite in this scenario?
Excellent question! I haven't tried that but I guess that in general you need to apply a 2D rotation matrix in the vertex shader. The trick is that if you just apply it on the existing code the sprites will rotate around the center of the screen instead of around their center (which is what I guess you want to do). You need to translate them to the center of the screen, apply the rotation and then translate back and continue. The vertex buffer provides us with corners of the quad: (0,0), (0,1), (1,0) and (1,1). When we multiply it by the WidthHeight vector which is in NDC we get the location in NDC of each corner. You need to translate each corner by half the width and height (in the negative). This moves the sprite to the center. Next you need to apply the 2D rotation matrix which will be a new uniform. And then translate back using the positive half of WidthHeight. From there you can continue as usual.
@@OGLDEV That is a very clever take, but wouldn't that rotate all vertices (from all sprites)?
@@razcodes The rotation angle needs to be a per sprite uniform, similar to position, etc. You will construct a different matrix for each sprite.
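The translate, rotate, translate-back steps described above can be sketched on the CPU like this. It mirrors what the vertex shader would do per corner, assuming BasePos is the bottom-left corner of the sprite and the angle is a per-sprite value in radians. Vec2 and RotateQuadCorner are names invented for this sketch. Note that if the rotation is done directly in NDC on a non-square viewport the sprite will look skewed, so an aspect ratio correction may be needed on top of this.

```cpp
#include <cassert>
#include <cmath>

struct Vec2 {
    float x;
    float y;
};

// corner is one of (0,0), (0,1), (1,0), (1,1) from the vertex buffer.
inline Vec2 RotateQuadCorner(Vec2 corner, Vec2 basePos, Vec2 widthHeight, float angleRadians) {
    assert(corner.x >= 0.0f && corner.x <= 1.0f && corner.y >= 0.0f && corner.y <= 1.0f);
    const Vec2 half{widthHeight.x * 0.5f, widthHeight.y * 0.5f};

    // Scale the unit corner to the sprite size and move the sprite center to the origin.
    const Vec2 local{corner.x * widthHeight.x - half.x, corner.y * widthHeight.y - half.y};

    // Standard 2D rotation (counter-clockwise for positive angles).
    const float c = std::cos(angleRadians);
    const float s = std::sin(angleRadians);
    const Vec2 rotated{c * local.x - s * local.y, s * local.x + c * local.y};

    // Translate back by the positive half size, then to the sprite position.
    return Vec2{basePos.x + rotated.x + half.x, basePos.y + rotated.y + half.y};
}
```

In the shader the same thing could be written with a mat2 built from the per-sprite angle, or the sine and cosine could be passed in directly to avoid computing them per vertex.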
I think you can handle it by adding the vertices to the vertex buffer already rotated via a transform matrix, and drawing with an element buffer that is preloaded at startup with all the indices for the maximum number of quads per batch. Then for each batch you call glDrawElements with the index count for the number of sprites to draw.
In the vertex shader you receive the position, UV, etc. attributes, simply multiply by the view-projection matrix, and output gl_Position plus the other attributes for the fragment shader.
I'm doing this in personal projects: with 2 rotating sprites I get 7000 fps in C#, and with 30k sprites I get about 300 fps.
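A possible way to prebuild that element buffer, as a sketch only: it assumes 4 vertices per quad stored in the corner order (0,0), (0,1), (1,0), (1,1) and 32-bit indices. BuildQuadIndices and IndexCountForSprites are hypothetical helper names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Builds the index list for maxQuads quads once at startup.
// Assumed corner order per quad: 0=(0,0), 1=(0,1), 2=(1,0), 3=(1,1).
inline std::vector<std::uint32_t> BuildQuadIndices(std::size_t maxQuads) {
    // Two counter-clockwise triangles per quad.
    static const std::uint32_t kPattern[6] = {0, 2, 1, 1, 2, 3};
    std::vector<std::uint32_t> indices;
    indices.reserve(maxQuads * 6);
    for (std::size_t q = 0; q < maxQuads; ++q) {
        const std::uint32_t first = static_cast<std::uint32_t>(q * 4);
        for (std::uint32_t offset : kPattern) {
            indices.push_back(first + offset);
        }
    }
    return indices;
}

// Number of indices to pass to glDrawElements for a batch of spriteCount sprites.
inline std::size_t IndexCountForSprites(std::size_t spriteCount, std::size_t maxQuads) {
    assert(spriteCount <= maxQuads);
    return spriteCount * 6;
}
```

Because the pattern never changes, the index buffer can be uploaded once and only the vertex buffer needs to be refilled per batch.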
thanks a lot
You're welcome :-)
Is there a simple way to add a uniform to the frag shader so that we can determine the offset of the spritesheet? Your set up is quite different from the opengl tutorial I'm following, so I'm curious if it's a simple task or not.
What do you mean 'offset of the spritesheet'? An offset inside it? In general, any value that you can calculate before the draw call can be set into the fragment shader as a regular uniform.
@@OGLDEV I define the vertices in an array with values being 1.0 and 0.0, I want to begin rendering at a specific part of the atlas, like we do with tilemaps, but the only solution I've gotten to work (which isn't a solution) was to call in c++ glPixelStorei(GL_UNPACK_ROW_LENGTH, width);
glPixelStorei(GL_UNPACK_SKIP_PIXELS, currentFrameX);
glPixelStorei(GL_UNPACK_SKIP_ROWS, currentFrameY);
and then after calling glTexImage2D, calling those same functions, and passing in 0 as an argument. This is definitely not ideal lol
If I understand this correctly you are using glPixelStorei to point to the correct sprite. My intuition is that using the uniforms and not touching the texture itself is better but I can't prove it or anything.
@@OGLDEV I actually just figured out that in the vertex array, the texture coord values just needed to be changed from 0 to something like 0.3 to begin the coords at a different spot. Now I need to figure out how to convert screen coords to these decimals. It was confusing how the texture coords and positions essentially alternate in the array.
That's what I do in the tutorial. 'screen coords' - you mean where to finally render the sprite? I translate screen position to NDC and provide it to the vs.
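Both conversions mentioned in this exchange are small enough to sketch. This assumes the atlas is a regular grid of equally sized frames and that the screen position is given in pixels with the origin at the bottom-left. AtlasFrameUv, ScreenToNdc, UvRect and Vec2 are names made up for the example.

```cpp
#include <cassert>

struct Vec2 {
    float x;
    float y;
};

// Origin and size of one frame in texture coordinates (0..1).
struct UvRect {
    Vec2 origin;
    Vec2 size;
};

// Texture rectangle of frame (column, row) in a grid atlas of columns x rows frames.
// Which row counts as row 0 depends on how the image was loaded (assumption: row 0 is at v = 0).
inline UvRect AtlasFrameUv(int column, int row, int columns, int rows) {
    assert(columns > 0 && rows > 0);
    assert(column >= 0 && column < columns && row >= 0 && row < rows);
    const Vec2 size{1.0f / static_cast<float>(columns), 1.0f / static_cast<float>(rows)};
    return UvRect{Vec2{static_cast<float>(column) * size.x, static_cast<float>(row) * size.y}, size};
}

// Maps a pixel position to NDC, where the screen spans -1..1 on both axes.
inline Vec2 ScreenToNdc(float px, float py, float screenWidth, float screenHeight) {
    assert(screenWidth > 0.0f && screenHeight > 0.0f);
    return Vec2{2.0f * px / screenWidth - 1.0f, 2.0f * py / screenHeight - 1.0f};
}

// In the shader, the texture coordinate of a quad corner (values 0 or 1) would then be:
//   texCoord = uv.origin + corner * uv.size;
```

The origin and size can be passed as per-sprite uniforms, so the texture itself never has to be touched when switching frames.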
Multiple identical quads is a simple solution, though you can also use only one with very simple modifications.
You mean with instancing?
@@OGLDEV yes
The main advantage with instancing comes when you have a very large model that you want to render multiple times while changing small bits and pieces such as matrices, etc. In the case of a single quad it seems a bit redundant but I have to admit that I haven't measured the two options so I can't say which will perform better.
3 Time of each color. 3 timer for each dimensions Each dimensions, each sprite had square and circle , and olygon and x an o super imposed onwith tecture of each and siper computer os can do fast only at location of display
Sorry, I didn't understand this comment.
Start from 1 so need no -1
Do you mean the screen position?