Optimization Strategy 1: (4:20) --> Construct the vertex in the same place as it will be stored. Rather than constructing it in the current method and then copying it to the vertex location. --> use emplace_back rather than push_back and only pass the parameter list for the constructor Ex. vertices.emplace_back(1, 2, 3); Optimization Strategy 2: (5:55) -->Remember to “know your environment”. --> If you know that you will need an array to store 3 vertex objects, define it as such so that it is not dynamically resized everytime it runs out of space. Ex. First define the vector, then use vertices.reserve(3); (on separate lines)
I really like C++. Many languages will abstract such things away, which can be very useful but I very often come back to C++ because I want that level of control.
I'd like to see a language that aims to do what C++ does but without keeping all the old C legacy stuff. Clean the language up a bit and make it less of a mess. Some of the standard library features should be language features and vice versa... and we'd have a really solid language for developing high performance applications using modern abstraction. Which is what C++ wanted to be from the beginning but got held down with trying to keep compatibility with C.
@@nexusclarum8000 You really should take a look at Rust. Just as close to the metal as C/C++ without all the baggage and some really good approaches to classic problems. It takes some getting used to, especially if you aren't over OOP yet, but I it's well worth it IMHO.
I think the biggest takeaway here is that you should ALWAYS try to reserve before you start pushing back elements into the container. Even if using emplace_back, the elements would be copied over in the case of a reallocation. So always call reserve before starting to push back. Even if you don't know exactly how many elements you will push back, a guesstimate is always better than not calling reserve at all.
The std::vector also has a built in memory allocator that you can implement to get even more optimization. For instance, creating objects which are automatically aligned on the native CPUs cache line. For CPU intensive work, ensuring objects are cache aligned can gain HUGE benefits because the CPU can fetch data for processing more quickly if that data respects the cache line. So yeah, std::vector has plenty of nice features!
Probably late, but do you have any examples of how I could try to ensure cache optimization? Or sources to reference/learn? I've done a very simple test a long time ago and saw the time to complete the test reduce extremely thanks to a basic cache optimization, but not sure how to deal with more complex situations.
Thanks. Helpful. Scott Meyer further elaborates by exploring when it's reasonable to expect emplacement to be better than insertion in his talk: "Scott Meyers - The evolving search for effective C++ - Keynote @ Meeting C++ 2014", at around the 20 minute mark.
5 หลายเดือนก่อน +3
In fact the Cherno forgot an important point. Until 03:07, the Vertex pushed back in the container is an rvalue, and therefore will be moved (not copied), since the move constructor is implicitly defined. Upon redefining the copy constructor however, the implicit move constructor is deleted by the compiler. By wanting to see where the copy constructor is called, we cause our code to copy the Vertex instead of moving it(until we redefine the move constructor ourselves).
What a great video! Loved the way you showed how to optimize a vectors memory footprint. These are the types of details that I have not yet seen in any other c++ videos on TH-cam. Wonderful job. Thanks a lot!
This rules. Thank you for showing clearly how you can look for ways to optimize and for giving such an easy to follow example, and thank you for all your videos in general. Trying really hard to learn C++ on my own and your content is more helpful than the MIT classes that are online. It's easy to follow, easy to digest, assumes that the viewer is competent, isn't condescending at all to people who are new to C++, all while explaining everything clearly.
I asked my uni professor the difference between push_back and emplace_back and he told me they are basically the same, and didn't explain any further. Thank you Cherno
The push_back() method actually has O(1) time complexity. For larger sizes of the array, it will always double the size, or make it proportionally larger by some constant multiplier. So even if you can't reserve the space beforehand, it won't be as bad an issue, it won't cause any exponential time complexity. (It would, if it resized the vector before every insertion.)
I'll make an echo of something that you already said... Get this into your head. YOU ARE THE BEEESSSTTTT. With you, everything is like a walk in the park. Thanks
I like the speed of these talks. I put them in the background and set the speed to about 1.5 or 1.25, but once in a while when it gets to something that causes a double take or that I want to pay specific attention to, I'll back it up a short bit and set the speed to normal. So for people who aren't already familiar, I bet it's about the right speed.
Warning C4244 ('argument' conversion from _Ty to float possible loss of data). changed -> vertices.emplace_back(1, 2, 3); to -> vertices.emplace_back(1.0f, 2.0f, 3.0f);
This was EXACTLY what I needed right now. Am using a global vector where each element is HUGE - in GBs. I was initializing this vector through resize (size, def_val). def_val being huge messes up with the stack. The reserve and emplace_back will help me save copy cost and help me create elements on the heap.
If you need even more control in some resource rich situation, check out std::allocator in the header. It is what vector and all other containers use to manipulate their storage, but you can get an even finer grain of control by using it yourself.
As Ilari Vallivaara has said, the time of copying N elements in a vector is no where near exponential. It's linear. You will find that if you do the amortized analysis which he has explained in his comments. I would like to give a practical proof for it. Lets see how much time it takes to push 100M elements in a vector without reserving the space versus reserving the space. First, without reserving the space. #include #include using namespace std; #define MAX 100000000 int main() { vector subject; for(int i=1; i
In my personal test in which I compared the usage of reserve() and resize(), it turned out that reserve() was indeed slower than resize() and also I had overloaded the new operator and from there I came to know that the reserve() method was allocating twice on the heap whereas the resize() method hit the heap only once. Why is this so? The compiler I used was g++ (MinGW) and compiled using std=c++17.
Personal Notes: - vertices.reserve(44) allocates memory that can hold 44 vertex object -vertices.push_back((1,0,0) and vertices.push_back(Vertex(1,0,0)) does unnecessary copying(and creating) of Vertex object, use vertices.emplace_back(1,0,0) instead not to copy unnecessarily. -I am bored and hurried, don’t trust 100% what I just said
Fun thing that cherno does not mention is that if you set the copy constructor = to delete the code at the end of the video wont compile but why? Turns out emplace back might copy (not always) and so anything you pass into emplace back MUST have a copy constructor (just in case) the only fix I am aware of (that still performs 0 copies) is having a vector of unique ptr's to vertexes instead of a vectors of vertexes
I thought this was going to be a look at how to write a good allocator for a vector, rather than just standard stuff everyone already knows. Writing an allocator for a vector to handle all of the cases it has to handle isn't trivial, this content is.
Why is it 6 then? If the first copy is from copying from the method to the vector, the second and third is from the copying of the second one and the extension of the length. Then wouldn't every subsequent one also just add two to the amount of copies.
I just added two more pushes (vertex(10,11,12) and vertex(13,14,15)) and these are the results when increasing the reserve value: reserve(1): 12 Copies reserve(2): 11 Copies reserve(3): 8 Copies reserve(4): 9 Copies reserve(5): 5 Copies Why is this happening?
Great video thx Cherno, but I got one question, how did we get 6 copies ? First copy because the instance was in the main memory then we had to copied to the heap or the allocate memory from the vector The second and third copies because the main and the resize the fourth ,fifth for same reasons, how about the six copy ?where did it came from? thanks for anyone in advanced whom answered me.
It's easier to think on the push_backs. First push back -> 1 copy from main (and now you have 1 item in the list) Second push back -> 1 copy from main, 1 copy of the previous item (now you have 2 in the list) Third push back -> 1 copy from main, 2 copies because you have 2 items on the list (now you have 3) Until here, you have 6 copies with no optimization was made On one future forth push back -> 1 copy from main, 3 copies fifth push -> 1 copy from main, 4 copies
Wow.. please make more of this kind of optimization videos and I'll make sure that whole of my class in tel aviv university computer science course will be happy to subscribe to your channel!! Im talking about the std:vector video ofc!
After the capacity is 2, shouldn't after reallocation the capacity become 4. Doesn't it doubles every time? Although only 3 elements should be allocated.
The size of the vector is not doubled. At first the vector has a capacity of 1. As one copy constructor is called. After that when we add new vertex to it. The capacity is increased by one. So both vertex are copied to the new vector list. As two copy constructor are called. When we add a third vertex a new vector array is created with 3 size and 3 more copy constructor are called as the vertex from previous vector are copied over to the new vector array.
Optimization Strategy 1: (4:20)
--> Construct the vertex in the same place as it will be stored. Rather than constructing it in the current method and then copying it to the vertex location.
--> use emplace_back rather than push_back and only pass the parameter list for the constructor
Ex. vertices.emplace_back(1, 2, 3);
Optimization Strategy 2: (5:55)
-->Remember to “know your environment”.
--> If you know that you will need an array to store 3 vertex objects, define it as such so that it is not dynamically resized everytime it runs out of space.
Ex. First define the vector, then use vertices.reserve(3); (on separate lines)
I really like C++. Many languages will abstract such things away, which can be very useful but I very often come back to C++ because I want that level of control.
I'd like to see a language that aims to do what C++ does but without keeping all the old C legacy stuff. Clean the language up a bit and make it less of a mess. Some of the standard library features should be language features and vice versa... and we'd have a really solid language for developing high performance applications using modern abstraction. Which is what C++ wanted to be from the beginning but got held down with trying to keep compatibility with C.
@@nexusclarum8000 You really should take a look at Rust. Just as close to the metal as C/C++ without all the baggage and some really good approaches to classic problems. It takes some getting used to, especially if you aren't over OOP yet, but I it's well worth it IMHO.
JK W no, that is not C#. C# is managed and abstracts everything, you have no control over anything.
@@nexusclarum8000 well there is a project held by MIT researchers that aims at exactly what you're talking about. It's called rust
@@nexusclarum8000 swift?
This is gold for embedded systems!! Thank you so much!
I think the biggest takeaway here is that you should ALWAYS try to reserve before you start pushing back elements into the container. Even if using emplace_back, the elements would be copied over in the case of a reallocation. So always call reserve before starting to push back. Even if you don't know exactly how many elements you will push back, a guesstimate is always better than not calling reserve at all.
The std::vector also has a built in memory allocator that you can implement to get even more optimization. For instance, creating objects which are automatically aligned on the native CPUs cache line.
For CPU intensive work, ensuring objects are cache aligned can gain HUGE benefits because the CPU can fetch data for processing more quickly if that data respects the cache line.
So yeah, std::vector has plenty of nice features!
Probably late, but do you have any examples of how I could try to ensure cache optimization? Or sources to reference/learn?
I've done a very simple test a long time ago and saw the time to complete the test reduce extremely thanks to a basic cache optimization, but not sure how to deal with more complex situations.
Thanks, good to know!
Thanks. Helpful. Scott Meyer further elaborates by exploring when it's reasonable to expect emplacement to be better than insertion in his talk: "Scott Meyers - The evolving search for effective C++ - Keynote @ Meeting C++ 2014", at around the 20 minute mark.
In fact the Cherno forgot an important point. Until 03:07, the Vertex pushed back in the container is an rvalue, and therefore will be moved (not copied), since the move constructor is implicitly defined.
Upon redefining the copy constructor however, the implicit move constructor is deleted by the compiler.
By wanting to see where the copy constructor is called, we cause our code to copy the Vertex instead of moving it(until we redefine the move constructor ourselves).
Clear, precise, and super useful! Thanks, Cherno
It's amazing to see when cpu copies the members in the memory. And seeing that we reduce the amount of times we need to copy, it is soo nice.
This was very informative. Awesome. I'd love to watch more optimization videos for general and often used c++ stdlib functions.
What a great video! Loved the way you showed how to optimize a vectors memory footprint. These are the types of details that I have not yet seen in any other c++ videos on TH-cam. Wonderful job. Thanks a lot!
This rules. Thank you for showing clearly how you can look for ways to optimize and for giving such an easy to follow example, and thank you for all your videos in general. Trying really hard to learn C++ on my own and your content is more helpful than the MIT classes that are online. It's easy to follow, easy to digest, assumes that the viewer is competent, isn't condescending at all to people who are new to C++, all while explaining everything clearly.
Super cool! I hope there is a lot more stuff like this later on in this series. Thanks Cherno.
Coolest video so far. Knowing under the hood is really helpful. Thanks Cherno!
This is really great! Thank you! Can you please make additional optimization videos? Possibly even how to write a class with optimization in mind.
This was BY FAR one of the most awesome things I've learned this month regarding std lib optimization in C++! You're a LEGEND
Good optimisations and clear show of how it helps in running fast, keep going.
Great, mate!
I asked my uni professor the difference between push_back and emplace_back and he told me they are basically the same, and didn't explain any further. Thank you Cherno
That was beautiful, thanks!
I love C/C++ , they r just mindblowing
My man your channel is a blessing from c++ gods. My c++ code runs like a rocket now thx bro
The push_back() method actually has O(1) time complexity. For larger sizes of the array, it will always double the size, or make it proportionally larger by some constant multiplier. So even if you can't reserve the space beforehand, it won't be as bad an issue, it won't cause any exponential time complexity. (It would, if it resized the vector before every insertion.)
I'll make an echo of something that you already said...
Get this into your head.
YOU ARE THE BEEESSSTTTT.
With you, everything is like a walk in the park.
Thanks
You're the best C++ teacher I've found in my life!
this is better than my uni programming course by far.
La mejor serie de c++ que vi :)
You go to details bro.
Thank you so much.
I like the speed of these talks. I put them in the background and set the speed to about 1.5 or 1.25, but once in a while when it gets to something that causes a double take or that I want to pay specific attention to, I'll back it up a short bit and set the speed to normal. So for people who aren't already familiar, I bet it's about the right speed.
I am a C# programmer trying to understand C++ and it seems Lists in C# are exactly vectors as I understand it from this video. Cool!
Warning C4244 ('argument' conversion from _Ty to float possible loss of data).
changed ->
vertices.emplace_back(1, 2, 3);
to ->
vertices.emplace_back(1.0f, 2.0f, 3.0f);
I got the same thing. Didn't remember that I needed .0 at the end of my values for it to register my f properly. Thanks for your help
Sick video - thanks Cherno!
This was EXACTLY what I needed right now.
Am using a global vector where each element is HUGE - in GBs. I was initializing this vector through resize (size, def_val). def_val being huge messes up with the stack. The reserve and emplace_back will help me save copy cost and help me create elements on the heap.
If you need even more control in some resource rich situation, check out std::allocator in the header. It is what vector and all other containers use to manipulate their storage, but you can get an even finer grain of control by using it yourself.
This C++ series is so great. Its free, to point and easy to understand.
My favorite part of this guy's videos : ) (9:26)
you're really good at this!
bro... that's PURE *GOLD*
As Ilari Vallivaara has said, the time of copying N elements in a vector is no where near exponential. It's linear. You will find that if you do the amortized analysis which he has explained in his comments. I would like to give a practical proof for it.
Lets see how much time it takes to push 100M elements in a vector without reserving the space versus reserving the space.
First, without reserving the space.
#include
#include
using namespace std;
#define MAX 100000000
int main()
{
vector subject;
for(int i=1; i
Now try to do the same with a proper struct/class. It's misleading if you just use int ... the difference will be much bigger.
In my personal test in which I compared the usage of reserve() and resize(), it turned out that reserve() was indeed slower than resize() and also I had overloaded the new operator and from there I came to know that the reserve() method was allocating twice on the heap whereas the resize() method hit the heap only once. Why is this so? The compiler I used was g++ (MinGW) and compiled using std=c++17.
I love your tutorials :d you explain it perfectly. helped me a lot. i plan to watch the whole series to understand cpp better.
7:36 I don't think the number of copies is going to be exponential, the formula is (n^2 + n) / 2 actually.
Cool :) Thank you for the video, that actually was really helpful, learned something new.
Thank you so much for the helpful video!
clear as crystal
Nice video. A couple of really easy and effective tips there. Thanks.
Best C++ tutorial video ever!
your hair is goals 😂
This is the best explanation I have seen! I have also realized that watching videos teaches (me) much better than books.
One of the reason I don't like Bjarne Stroustrup's books, because it's like he's talking an alien language.
Dude I don’t know why but read countless books but I learned so much more from how you explain things great job
Very nice taste of optimizations to come. Can't wait for the game engine series!
Personal Notes:
- vertices.reserve(44) allocates memory that can hold 44 vertex object
-vertices.push_back((1,0,0) and vertices.push_back(Vertex(1,0,0)) does unnecessary copying(and creating) of Vertex object, use vertices.emplace_back(1,0,0) instead not to copy unnecessarily.
-I am bored and hurried, don’t trust 100% what I just said
This is by far the best cpp tutorial video, made by Cherno.
as clean as a whistle!!
Man this was great. Thanks dude.
Absolutely superb!!
Additionally you talk like Brüno...
This must be the best c++ series in all German spoken countries except Germany!!
incredible video, it s really teaching a lot!
thenks
This video is gold! More optimization videos pls
Your videos are gold!
i like to compare c++ versus c# to a manual transmission versus an automatic. With great power comes great responsibility, and great reward!
that was fantastic
Great! Thanks!
This is gold.
How we can optimize something like std::vector values ?
never seen this detail anywhere. sweet.
So is emplace_back now always the better option to push_back?
Fun thing that cherno does not mention is that if you set the copy constructor = to delete the code at the end of the video wont compile
but why?
Turns out emplace back might copy (not always) and so anything you pass into emplace back MUST have a copy constructor (just in case)
the only fix I am aware of (that still performs 0 copies) is having a vector of unique ptr's to vertexes instead of a vectors of vertexes
So much quality in this video
Thanks!!
I thought this was going to be a look at how to write a good allocator for a vector, rather than just standard stuff everyone already knows. Writing an allocator for a vector to handle all of the cases it has to handle isn't trivial, this content is.
The best tutorials ever. Thank you!
Outstanding
When we use emplace_back, does the vector object get constructed in heap?
Thanks
Thanks man!!!! This is one of the best...
Why is it 6 then? If the first copy is from copying from the method to the vector, the second and third is from the copying of the second one and the extension of the length. Then wouldn't every subsequent one also just add two to the amount of copies.
I just added two more pushes (vertex(10,11,12) and vertex(13,14,15)) and these are the results when increasing the reserve value:
reserve(1): 12 Copies
reserve(2): 11 Copies
reserve(3): 8 Copies
reserve(4): 9 Copies
reserve(5): 5 Copies
Why is this happening?
I have the same question , did u get any answer for that?
Wow, I had no idea that it worked like this!
superb!
Great video thx Cherno, but I got one question, how did we get 6 copies ?
First copy because the instance was in the main memory then we had to copied to the heap or the allocate memory from the vector
The second and third copies because the main and the resize
the fourth ,fifth for same reasons, how about the six copy ?where did it came from? thanks for anyone in advanced whom answered me.
did you got your answer?
@aziz aziz
Even am wondering the same,
3 for 3 object copy from stack to heap.
2 for extending the heap memory.
Not sure about one more 😐
It's easier to think on the push_backs.
First push back -> 1 copy from main (and now you have 1 item in the list)
Second push back -> 1 copy from main, 1 copy of the previous item (now you have 2 in the list)
Third push back -> 1 copy from main, 2 copies because you have 2 items on the list (now you have 3)
Until here, you have 6 copies with no optimization was made
On one future forth push back -> 1 copy from main, 3 copies
fifth push -> 1 copy from main, 4 copies
Wow.. please make more of this kind of optimization videos and I'll make sure that whole of my class in tel aviv university computer science course will be happy to subscribe to your channel!!
Im talking about the std:vector video ofc!
Technion will too :)
Such a great video!! So why shouldn't I use emplace all the time?
simple and great example!
After the capacity is 2, shouldn't after reallocation the capacity become 4. Doesn't it doubles every time? Although only 3 elements should be allocated.
The size of the vector is not doubled. At first the vector has a capacity of 1. As one copy constructor is called. After that when we add new vertex to it. The capacity is increased by one.
So both vertex are copied to the new vector list. As two copy constructor are called. When we add a third vertex a new vector array is created with 3 size and 3 more copy constructor are
called as the vertex from previous vector are copied over to the new vector array.
isn't the vector capacity increased by a multiple of 2 everytime enough space is not available??
What if you have instances of the vector class instantiated already? Surely we dont pass all their attributes individually into emplace_back
This is by far the best video I've ever seen
ily cherno
thank you. That saved for me an very important ms. speed up 2x
I've using vectors for 1 year without knowing this, Thanks Cherno
ahh.. gotta love optimization.. :) indeed so satisfying if you clean up and let your programm perform better
Don't C++ compilers automatically unroll as they optimize?
hmm could be true, don't know that actually (learned unrolling in another language though..)
Yeah, it shouldn't be needed in C++.
Please make an episode on templates as well
7:40 does it though? does it really go up exponentially? that sounds linear to me...
It's even less than linear if vector grows by a factor (instead of just adding 1 to the capacity) : p
Yep, it is 2x linear :)
If emplace_back does the same thing as push_back but better then why do we need to use push_back? I mean why does push_back even exist?
Super useful. I'm impressed with your grasp and ability to teach
that guitar is going to topple anytime now
c pillow like channel logo impressing
I'm emplaced with this c++ series.
Interesting that it seems the copies is equal to the number pushed back factorial
ty good man
In 6:11 you say, if we know "our environemnt" and planing to push three vertex objects, why not using an array with three elements instead?