I can't believe I'm watching this right now lol. I spent two nights coming up with this same idea to solve a problem, and now I find out it was actually a thing
Dodgy talking about that without mentioning strict aliasing aka what happens when you think you changed your struct values, but the compiler just updated a register. Suddenly the memory address is not the value you think it is
I noticed the memory was little-endian. I assume endianness is a factor to consider here. A 4-character string, interpreted as an int, will give you radically different results on a Linux x86 machine compared to a Linux SPARC machine. Or am I wrong?
Not always, but usually yes. ;-) Endianness is a property of the CPU, not of the operating system, so accessing memory through an int pointer to index the int members of a struct works the same on Linux x86 as on SPARC, MIPS, etc. But this is only true if you cast elements back to the types they originally were and if the data stays inside the same program.
The problem starts when the data leaves your program in any way. If you serialize an int, send the bytes over a network to a machine with a different endianness, then read them back there and cast back to an int, the value will be backwards. This is why the htons() and ntohs() functions exist for TCP/IP networking. The network byte order is standardized: if the system you compile for uses the same byte order these functions do nothing, and if it differs they swap the bytes.
Another issue is that sizeof(int) is implementation-defined. If you send ints from a PC to an 8-bit microcontroller, where int may be just 16 bits, you also have a problem. Theoretically you could even have a problem when you save data to a file and open it on the exact same machine and OS, but with a different program that was compiled with a different compiler. This is why one should avoid int for such data and use fixed-size types like uint32_t instead.
Another issue is memory alignment and structs. Some architectures, ARM for example, don't like unaligned memory access, so a 32-bit integer should reside on a 4-byte address boundary. So if you have a struct { char, uint32_t } you typically get 8 bytes allocated: the first byte is the char, then 3 bytes of padding, then 4 bytes of int. If you serialize this struct by indexing its memory you may expect 5 bytes of data, but you get 8 because of the padding. This is a problem when you want to map a struct to a hardware register or to some kind of network data.
Some compilers can be told to generate packed structs to avoid padding. Non-standard, non-portable, of course: __attribute__((packed)) in GCC, #pragma pack in others, etc. And if the target CPU really can't do unaligned access, the compiler will generate two aligned reads for that unaligned int and then shift the bits to assemble it for you. Performance issue... All of this can happen, and this is why this is considered dangerous and you should not do it unless you are developing a low-level system in a specific environment and you know exactly what you are doing.
once I get to the casting video I feel I will want to watch the series over again just to understand the fifty occasions that casting has already come up :P
I see a few very special uses, so I consider it very useful, though I was first introduced to the idea through unions. I often associate type punning with things like freelists, and I suspect it's important for how some libraries get around endianness to use their own byte orders. I've also used it like this to change a function that hashes strings into one that hashes any data type (though preferably none that include pointers). Very useful, though definitely not something you just do every day.
Hi! You could do a tut' about memory pools and create a simple class for that. At the end you could give some advice on making an advanced pool, or even make a new video about the advanced memory pool. I think with your explanation we could reach the next level of efficient game programming.
This video is missing a big red neon warning at the very beginning that most things described in it are actually undefined behavior and should never be done. This code *will* break and cause a lot of grief.
Aliasing an int as a double is stupid because of the different size. And the only way to do legal aliasing in C++ is memcpy. And you can use bit_cast to convert the bit representation of a fundamental variable.
This demonstrates perfectly why C/C++ are so powerful, but at the same time there are so many vulnerabilities in code written in these languages. Doing this can be perfectly safe, but at the same time it opens the door for all kinds of problems. Just like casting a struct to an array and then accessing an index that isn't part of the struct's memory.
Yeah do this type (pun intended) of thing all of the time in C. Normal fare really. I get that a lot of "modern programmers" arf over that notion but there are times when it is useful. Like with arguments over which language is best, I find such criticism to shed more heat than light. I digress. Good video.
This is undefined behavior btw. You should be very careful with this because different compilers will act in different ways. You can run into issues relating to lifetimes and alignment.
The whole experience of programming in C can be summarized as undefined behaviour. As far as I remember alignment isn't really an issue, of course it would be if you wrote the code that was demonstrated, but there are easy ways to deal with alignment if you actually want to dive into this unique flavour of BDSM.
@@OFfic3R1K The best way to legally type pun in C++ is to use memcpy (which is legal since a new object is created, so the strict aliasing rules aren't violated). Since C++20 we also get std::bit_cast, which allows type punning casts and can be used in constexpr situations.
@@lincolnsand5127 @OFfic3RiK Great discussion! As a newbie I have a question. If our code is absolutely free of alignment and endianness problems, could type punning still cause unexpected errors? I saw a Stack Overflow example claiming that, when compiler optimization is enabled, under the strict aliasing rule the compiler may assume that two pointers of different types point to different places, and thus decide that writes via the 1st pointer do not affect the value the 2nd pointer points to. So it generates machine code that may not reload the 2nd pointed-to value in time, which leads to catastrophe. However, some other people were saying the post was wrong. Do you think this is a legitimate issue?
@@wusuoliu5431 Well. It depends. gcc and clang have a flag called -fno-strict-aliasing. If you don't use that flag, compiler optimizations might break your code. If you have the flag, *only* hardware-specific things will (e.g. some architectures don't allow unaligned loads).
This is undefined behavior. It violates the strict aliasing rules. This is only allowed if you are punning to std::byte, char, unsigned char, or a type "similar" to the one you are punning from. Use memcpy instead. It'll get optimized away so don't worry about overhead.
5:20 - the same code for C#:
using System;
class Program
{
unsafe static void Main()
{
int a = 50;
double value = *(double*)&a;
Console.WriteLine(value);
Console.ReadLine();
}
}
The memory value of "a" is: 32 00 00 00
The memory value of "value" is: 32 00 00 00 00 00 00 00
This is very interesting and a nice demonstration of 'C++ is no magic', but at the same time you should probably make it clearer that this should NOT be used in projects by anyone but experts, and even they should think about 4 times before doing it. In most cases the possible performance penalty is probably preferable to the insecurities and other problems this stuff can cause. There are some situations where I might think of using this, e.g. to convert a 3-element array (interpreted as a 3D vector) to a struct {double x,y,z}, but TBH I lack the experience to try this in anything but a POC. Edit: So basically the unions in the very next video. Neat, was this video meant mostly as an explanation?
I love your videos. Have been sort of binge watching the C++ series. But on this video you kinda lost me. Looking at this code feels so dirty :p I don't immediately see the practical application in a real codebase. Seems like it could be a source of bugs if you're not careful. Looking forward to the casting videos however!
You can do the same thing kind of in JavaScript with Objects. Every Object is really a sparse array so you can just pull stuff out of the object as if it was an array.
After all these years I'm still not sure how I feel about C-style casts versus static_cast, dynamic_cast, const_cast and reinterpret_cast in C++ code. Since the C-style cast is not very clear about which type of cast will occur (it depends on the data types), I feel the more verbose C++-style casts help when I read code that isn't mine. Because, and correct me if I'm wrong, the C-style cast actually tries the various C++-style casts in order until it finds one that works.
Now i don't know at the moment what i would do with that, but i feel like i can write some troll code accessing everything by casting to a char pointer and moving it around
Watch out for the type punning police! They really don't like it when you break out these tools. Did you know that some compilers require special options to be deactivated in order to even do these things in c++? Some circles are trying to dumb down the language and put us in a safety box, but I'm totally in support of having such tools. We obviously need to be responsible with them, but they are absolutely necessary to do certain things efficiently. For instance, good luck designing a memory pool without using type punning. Or some type of array/allocator that manually calls constructors as they are needed.
Could you technically use this to modify private class variables you wouldn't have access to normally? I guess I could test it out but it's an interesting thought.
Can't we just write float value = *(float*)&a; That way we are accessing the starting point in memory of variable a, and telling the compiler to treat that memory address as a place which holds float (which means next 4 bytes belong to the variable called value, just like with integer a, which means that we are not reading the memory that we didn't want to read).
The messier it gets, the more I like it. It could be used to deserialize objects and stream them to a file or over a connection... Just thinking.🤔
Hi Cherno, first of all your videos are very good, but the pace is a little fast. If we work on a 32-bit OS and want to save the address of a variable, we need 4 bytes, e.g. char c; char *ptr = &c; so to save the address of char c we need at least 4 bytes, right? Now my question: per the C++ standard, the size of an empty class is 1 byte so that distinct objects have distinct addresses. So how does the compiler fit an empty struct or class into 1 byte?
This must be the "scary basement" of C++.
Type punning is real fun to mess around with.
"I guess I am pretty sane"
I have my doubts
Remember that coffee intro? :P
This is one of the most essential videos Cherno has done, imo. It really demonstrates the fundamentals and power of C++ memory usage/manipulation. Good stuff, thanks for the “pointers”.
He didn't mention undefined behavior in this video. A lot of code here has undefined behavior. You need to walk on a thin line if you want to write code like this and avoid undefined behavior. I have read Joe Zbiciak's answers and comments on Quora about what is defined and what is undefined (at least for C, usually what is true for C is also true for C++ but here we are on a thin ice territory). If you don't know by heart what is and what isn't undefined behavior when it comes to things like this you can't guess it.
Structure padding / member alignment, definitely needed a mention here.
This technique is used in Quake's notorious fast inverse square root function. It converts a float into a long in order to use bit manipulation.
I finally understood better what was done in that Quake video
But Carmack said "SIMD is not a useful thing", and now we use glm instead of bitwise magic. :D
Carmack is a genius
@@magellan124 carmack didn't invent it, he found out about it from someone else
@@softed who?
This is my favorite episode in the series. If someone asks me how can programming make someone feel powerful, I'm going to refer them here.
I do this regularly and I know exactly what you mean!
This is my favourite comment. If someone asks me if I have seen a good comment before, I'm going to refer them here.
@@jakub7321 This is my favorite sub-comment, which points to the favorite comment. If anyone asks me about why C++ is so wonderful, I'll tell them about pointers and point to this sub-comment, which will refer to the parent comment.
@@mykolatetiuk6661 I really don't want to prove something but here is a C# equivalent code. Maybe someone doesn't even know that you could do that in C# too ;-)
using System;
class Program
{
struct Entity
{
public int x, y;
}
unsafe static void Main()
{
Entity e = new Entity{x = 5, y = 8};
int* position = (int*)&e;
int y = *(int*)((byte*)&e + 4);
Console.WriteLine(position[0] + ", " + position[1]);
Console.WriteLine(y);
Console.ReadLine();
}
}
I love how you are getting into the advanced features of C++ in such a straightforward way. These have long been shrouded in verbose mystery, and you have such a beautiful way of demystifying them in simple terms. Thank you for being an amazing teacher.
The way you passed the ints as an array blew my mind
I love type punning network packets, it allows me to overlay a struct to parse out its members (without the padding of course).
I learn far more, and more deeply, from your videos than I could during my class.
Concise explanation for complicated concepts.
I will review C++ with these videos and I hope I get more used to C++.
Thank you very much for your work!
Before some newbie programmer does any of those pointer tricks, you should also teach them about packing, because when you start mixing types inside a struct and/or switching platforms, the code will not work if the padding changes. I've been in the industry as an ASM/C/C++ programmer since the mid 80s, and to be honest, other than when messing around, people don't do stuff like that in production code (unless they want to be fired).
Hey!
I've started watching your videos only recently and I love the way you explain things, everything is so clear here unlike most YT channels.
In one of the c++ videos (how to write a c++ class) you wrote a LOG class but at the end said "that is absolutely not how i would write a log class". If that's possible in the future, I would love to see a video or a series of videos where you write an actual short project step by step.
Have a nice day and keep up the great work :)
That int y example cracked me up. Great video!
All your videos are super useful, they have just the right length, neither too short, nor too long
I think beginners often get into trouble with strong types and C++ allows tricks like this which makes people try to go around the type system for no good reason in most cases. When you become more experienced you have less and safer type conversions, because the type system starts to make sense.
idk why everyone thinks that "beginners" just go around type punning shit for no reason. what do you think they're doing? you think static typing is so difficult a concept to grasp that, instead of just using the type system like they're supposed to, novice programmers just think it's a good idea to store all their scalars as an int64 and fucking type pun into them? (yes, this is a reply to a two year old comment.)
@@loli42 Maybe someone who doesn't even know what a type pun is (like me) would try to use % on a double by first using a long long then casting that way too big long long into a double without losing a bunch of data. As far as I know, cpp doesn't give warnings for bad or unnecessary casting.
@@puppergump4117 that's not type punning, it's casting, and c++ compilers most certainly do emit warnings for implicit narrowing conversions because msvc won't shut the fuck up about it
@@loli42 I obviously know that. Which is why I explicitly said 'casting'. After casting loses double data, I try type punning. And from MY experience, there were no warnings.
mind blown, made my love for C++ increase even more!!
A very nice video! This actually provided me a different angle to understanding pointers.
When I follow along with the code I usually add comments about what Cherno said and what I thought.
I hope the following excerpt will help someone else get a different view on pointers as well.
So, in my mind...a pointer is like a tool that specifies:
1. How many bytes to read? (int pointer reads 4 bytes, double pointer reads 8 bytes, a pointer of a struct with size 27 bytes reads 27 bytes).
2. From where should we start reading those amounts of bytes?
This kind of makes our issue clear (talking about "double value = *(double*)&a;" here). We initialized a variable of size 4 bytes (int a = 50). So 4 bytes somewhere in memory were allocated to represent the value 50. In order to read these 4 bytes, we need a 'reader' that knows to read exactly 4 bytes (int pointer), and also from where to read them. However, we constructed a 'reader' that reads 8 bytes (double pointer). The starting position of these readers is the same. However, the 8-byte reader reads 4 bytes more than was specified when we wrote int a. And so, in addition to the 4 specified bytes, we also read 4 bytes that we never specified. While the end result does correctly allocate 8 bytes (the newly created double variable), the issue is that those 4 additional bytes do not belong to a, and thus the outcome is unpredictable.
Try to think of it this way: when you cast pointers or a variable's address (&var) you are just saying to the compiler "Hey, use this memory address to read or write and trust me that everything will be fine". You can have 10 different vars/pointers pointing, reading and writing to that same memory space, and the compiler doesn't care if you use this C-style casting. But if you make a mistake you'll get undefined behaviour or memory corruption.
The "double value = *(double*)&a;" reminded me of
"i = * ( long * ) &y; // evil floating point bit level hacking"
from Quake III Arena source code.
Neither is a legal way to type pun in C or C++. So they are at least similar in that way.
You refer to it like it's some popular joke, not some random line of code buried within a game's source code... I love it XD
@@SandFoxling it is a popular joke, what do you mean? The quake 3 fast inverse square root code is legendary. (For the comments more than the actual code)
@@totallynuts7595 I figured, I guess I'm not old enough... I have read 'masters of doom' but that's about the extent of my knowledge on the early days of ID and their legendary feats.
I guess you still would have to call that joke within a specific circle to actually cast it into a popular type.
@@SandFoxling i'm not that old either, compared to the games. I just like Doom and Quake
This is actually a powerful manipulation from C. As a C programmer learning C++, I am super satisfied Cherno covers it.
This is awesome, more in-depth videos like this would be very nice! Good work!
It's worth noting that type punning via arbitrary types, even if they're the same size has undefined behaviour. You can argue that it might appear to work in some situations or with optimisations turned off, but it's generally just not a good idea. Enabling optimisations, compiling to a different platform, adding a comment above the code etc... could break the code. Your initial Entity example for example violates the strict aliasing rule and isn't legal C++.
all the examples are UB
thanks for the warning!
It's one of the coolest features of C and C++. Something I didn't have in other languages and IDK if I understand the term correctly, but it seems like a zero cost abstraction.
As part of my studying of computer architecture, the one thing I have learnt is that we want to copy as little as possible. If data exists anywhere in memory that is useful to different parts of a program then use pointers to point to its address location. Essentially what this is doing is allowing us to treat that data as a different type depending on the context in which it is being used and the pointer says how to interpret the data, i.e the pointer says interpret this data as a float or an integer.
9:56 this is pretty smart! The principle is to shift the pointer by sizeof(e.x) bytes (4 here) from the start of e, so that it points directly at the beginning of y. Then you dereference it as an int (which fixes the size and format to read) to get the actual value stored there.
just before you dereference it (the * on the very left) it was all very clever pointer address maneuvering! I have learned so much from your vids!!!!!!!!!!!!!! you are damn cool!
Hey Cherno, It would be really cool if you covered the subject of unions in a video.
i am pretty new to c++, but i love pointers soo much. you can do so many things with them, even void pointers, which i thought at the beginning didn't make any sense :P btw thanks for all the c++ vids cherno, i like that you explain what happens in the background as well, it makes things make sense ;) :D
What are you doing on this video if you're new to C++? Go watch older videos lol
So in your opinion if you watched all of Cherno's videos you are no longer a beginner at C++?
pointers were great, but I've almost never used any pointers since c++11 and STL, because they tend to generate bugs and crash programs (if you don't handle them well). pointer-related garbage collection is such a pain.
garbage collection? i thought that was only a thing in managed languages like c# (i think that is what you call them). I don't see how garbage collection is a thing in c++ (unless you run it managed, which i think is somehow possible), i mean that is also a big reason why i came to c++ ... so i can manage memory myself, and not worry about constructing any reference type objects (so heap allocated) in any part of my program that gets run quite often. (so a game loop, in my case that was block placing)
it was just too annoying to deal with
EDIT: removed a word "a" which was somewhere it shouldn't have been XD
yeah I think I misused the term 'garbage collection'. I don't mean that C++ has a garbage collector, only that the syntax of C++11 (unless you have to deal with some legacy code) makes using C's malloc(), free(), or C++98's 'new' and 'delete' unnecessary. All that work is done in the built-in containers and classes, so it pretty much behaves like a 'garbage collector' from a programmer's perspective. (not a real garbage collector though, more like an API-level garbage collection that prevents memory leaks)
Diggin it man, I'm on Ep. 7; constructing your sparky engine. It has been, thus far, all good in the hood. If there are any surprises, especially considering how much development Emscripten has undergone between then and now, I would like to (somehow) send you information of these changes, or perhaps get the 411 from you yourself on what to watch out for, ahead of time. Quality channel, rock on!
Generic memory pools that take advantage of cache locality and reduce memory fragmentation are cases where it's absolutely necessary to directly tell the compiler how to interpret a type
Finally, a feature which cherno actually uses!
I cant believe I'm watching this right now lol.
I spent two nights to actually come up with this same idea to solve a problem and now I find out it was actually a thing
It is super useful for things like deserializing bytes from a source (file, internet stream, etc.)
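For the deserializing use case, here is a hedged sketch of pulling a struct out of a raw byte buffer with memcpy instead of casting the buffer pointer. `PacketHeader` and `read_header` are hypothetical names invented for this example, and the sketch assumes sender and receiver share the same struct layout and byte order.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical wire format, for illustration only.
struct PacketHeader {
    std::uint32_t id;
    std::uint16_t length;
    std::uint16_t flags;
};

static_assert(std::is_trivially_copyable<PacketHeader>::value,
              "memcpy-based deserialization needs a trivially copyable type");

// Copies sizeof(PacketHeader) bytes from the buffer into a real object.
// Returns false when the buffer is too short to hold a header.
bool read_header(const unsigned char* data, std::size_t size, PacketHeader& out) {
    if (size < sizeof(PacketHeader)) {
        return false;
    }
    std::memcpy(&out, data, sizeof(PacketHeader));
    return true;
}
```

Copying into a properly typed object also avoids the alignment problem you can hit when the header does not start at a suitably aligned address inside the buffer.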
Absolutely incredible content..
Dodgy talking about that without mentioning strict aliasing aka what happens when you think you changed your struct values, but the compiler just updated a register. Suddenly the memory address is not the value you think it is
My phone screen is too tiny to appreciate the code.
This is cool. It's kind of deeper than casting.
Thanks The Cherno for your cool video!
I noticed the memory was little-endian. I assume endianness is a factor to consider here. A 4 character string, interpreted as an int, will give you radically different results on a Linux x86 machine compared to a Linux Sparc machine. Or am I wrong?
Not always, but usually yes. ;-)
Endianness is not an operating system thing, so accessing memory through an int pointer to index the int struct elements should work the same on Linux x86 and on Sparc, MIPS, etc. without worrying about it.
But this is only true if you cast the elements back to the types they originally were and if you work with the data in the same program only.
The problem starts when the data leaves your program in any way.
If you serialize an int, send the chars over a network to a machine with different endianness, then read it back there and cast it back to an int, it will be backwards. This is the reason why the htons() and ntohs() functions exist for TCP/IP networking. The network byte order is standardized, and if the system you compile on uses the same byte order then these functions do nothing; if it is different then they swap the bytes.
Another issue is that sizeof(int) is implementation-defined. It could very well happen that if you send ints from a PC to an 8-bit microcontroller, where int could be just 16 bits, then you also have a problem. Theoretically you could even have a problem when you save data to a file and open it on the exact same machine and OS, but with a different program, which might be compiled with a different compiler. This is why one should avoid using int for this, and use fixed-size types like uint32_t instead.
Another issue is memory alignment and structs. Some architectures, ARM for example, don't like unaligned memory access, so a 32-bit integer should reside on a 4-byte address boundary only.
So if you have a struct { char, uint32_t } then you have 8 bytes allocated: the first byte is the char, then 3 bytes of padding, then 4 bytes of int.
If you serialize this struct by indexing its memory you may expect that you have 5 bytes of data, but instead you have 8, because of the padding.
This is a problem when you want to map a struct to a hardware register or some kind of network data.
Some compilers have a way to tell them to generate a packed struct to avoid padding. Non-standard, non-portable, of course. __attribute__((packed)) in GCC, #pragma pack in others, etc...
And if the target CPU really can't do unaligned access, then the compiler will generate two aligned reads for that unaligned int, then shift the bits to assemble it for you. Performance issue...
All of this can happen, and this is why it is considered dangerous and you should not do it unless you are developing a low-level system in a specific environment and you know exactly what you are doing.
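To make the padding and byte-order points concrete, here is a small sketch that writes the { char, uint32_t } struct field by field instead of dumping its memory. The names `Sample`, `serialize` and `deserialize` are invented for this example, and big-endian ("network") order is just the convention chosen here.

```cpp
#include <cstddef>
#include <cstdint>

struct Sample {
    char tag;
    std::uint32_t value;
};

// sizeof(Sample) is typically 8 because of 3 padding bytes after `tag`.
// Writing each field explicitly always produces 5 bytes, independent of
// padding and of the host's byte order.
std::size_t serialize(const Sample& s, unsigned char* out) {
    out[0] = static_cast<unsigned char>(s.tag);
    out[1] = static_cast<unsigned char>(s.value >> 24); // most significant byte first
    out[2] = static_cast<unsigned char>(s.value >> 16);
    out[3] = static_cast<unsigned char>(s.value >> 8);
    out[4] = static_cast<unsigned char>(s.value);
    return 5;
}

// Rebuilds a Sample from the 5-byte form produced by serialize().
Sample deserialize(const unsigned char* in) {
    Sample s;
    s.tag = static_cast<char>(in[0]);
    s.value = (static_cast<std::uint32_t>(in[1]) << 24) |
              (static_cast<std::uint32_t>(in[2]) << 16) |
              (static_cast<std::uint32_t>(in[3]) << 8) |
               static_cast<std::uint32_t>(in[4]);
    return s;
}
```

Because the shifts work on values rather than on memory, the same code gives the same bytes on little-endian and big-endian machines, with no packed-struct extension needed.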
Yeah, this is not portable!
@@gabrielbarrantes6946 It is portable if you convert it to network byte order before you work on it then convert it back.
I clicked for C++ puns.. utterly disappointed :|
In programming you have to construct everything yourself, I thought I knew that... Unless of course there's a preconstructed library.
Variables can have multiple "meanings" depending on the datatypes they are cast to.
Puns, you see?
I think you mean, utterly "disappointered" ;)
Sorry...
*insert puns here*
I should have asked for her namespace because now i have a STD
Jp Silver epic 😂😂
Type puns here
@Jp Silver , I *c* what you did there. 😁
xD xD
this is a super important topic. I studied it initially from Jerry Cain's lectures @ stanford....
Really useful to know - thank you.
I love type punning really!.
thanks!
Thanks for this video!
you didn't mention "offsetof"
it's a really useful macro for type punning.
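A short sketch of what offsetof buys you: the compiler reports where a member really lives, padding included, so you don't have to hard-code a byte count. `Record` and `read_count` are hypothetical names used only for this example.

```cpp
#include <cstddef>
#include <cstring>

// Mixed member types, so the compiler is likely to insert padding.
struct Record {
    char tag;
    double weight;
    int count;
};

// Reads `count` through its byte offset instead of guessing the layout by hand.
int read_count(const Record& r) {
    const unsigned char* bytes = reinterpret_cast<const unsigned char*>(&r);
    int count;
    std::memcpy(&count, bytes + offsetof(Record, count), sizeof(count));
    return count;
}
```

Note that offsetof is only guaranteed to work on standard-layout types, which `Record` is.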
At some point in the next 10 years, I'm sure I'll finally "get" this video...
once I get to the casting video I feel I will want to watch the series over again just to understand the fifty occasions that casting has already come up :P
[11:16] This is one way C++ blows my mind.
keep the good work, thx for everything
I see a few very special uses, so I consider it very useful, though I first was introduced to the idea using unions.
I often associate type punning with things like freelists, and I suspect it's important for how some libraries get around endianness to use their own byte orders. I've also used it like this to change a function for hashing strings into one that hashes any data type (though preferably none that include pointers). Very useful, though definitely not something you just do every day.
Hi! You could do a tut' about memory pools and create a simple class for that. In the end you could give some advice on making an advanced pool or even make a new video about the advanced memory pool. I think, with your explanation we could go to the next level of efficient game programming.
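The "hash any data type" idea can be sketched as below. This uses the FNV-1a algorithm as a stand-in (the comment doesn't say which hash was used), and `hash_bytes` / `hash_object` are made-up names. Reading an object through `unsigned char*` is one of the aliasing cases the language does allow.

```cpp
#include <cstddef>
#include <cstdint>
#include <type_traits>

// 32-bit FNV-1a over a run of raw bytes.
std::uint32_t hash_bytes(const void* data, std::size_t size) {
    const unsigned char* p = static_cast<const unsigned char*>(data);
    std::uint32_t h = 2166136261u;   // FNV offset basis
    for (std::size_t i = 0; i < size; ++i) {
        h ^= p[i];
        h *= 16777619u;              // FNV prime
    }
    return h;
}

// Hashes the object representation of any trivially copyable value.
template <typename T>
std::uint32_t hash_object(const T& value) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "only raw-byte-friendly types make sense here");
    return hash_bytes(&value, sizeof(T));
}
```

Two caveats: padding bytes inside a struct are hashed too, so equal-looking structs can hash differently, and hashing a pointer member hashes the address rather than what it points to.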
I am so thankful to you just because of this video..........thanks CHERNO
I would like to see some file input and output...great and thorough videos love to see more
This video is missing a big red neon warning at the very beginning that most things described in it are actually undefined behavior and should never be done. This code *will* break and cause a lot of grief.
Aliasing an int as a double is stupid because of the different size. And the only way to do legal aliasing in C++ is memcpy. And you can use bit_cast to convert the bit representation of a fundamental variable.
I think a video about exceptions wouldn't be so bad to have either
This demonstrates perfectly why C/C++ are so powerful, but at the same time there are so many vulnerabilities in code written in these languages. Doing this can be perfectly safe, but at the same time it opens the door for all kinds of problems. Just like casting a struct to an array and then accessing an index that isn't part of the struct's memory.
Yeah do this type (pun intended) of thing all of the time in C. Normal fare really. I get that a lot of "modern programmers" arf over that notion but there are times when it is useful. Like with arguments over which language is best, I find such criticism to shed more heat than light. I digress. Good video.
This is undefined behavior btw. You should be very careful with this because different compilers will act in different ways. You can run into issues relating to lifetimes and alignment.
The whole experience of programming in C can be summarized as undefined behaviour. As far as I remember alignment isn't really an issue, of course it would be if you wrote the code that was demonstrated, but there are easy ways to deal with alignment if you actually want to dive into this unique flavour of BDSM.
@@OFfic3R1K The best way to legally type pun in C++ is to use memcpy (which is legal since a new object is created. This means the strict aliasing rules aren't violated). We do get std::bit_cast though which will allow for type punning casts and it can be used in constexpr situations.
@@lincolnsand5127 @OFfic3RiK Great discussion! As a newbie I have a question. If our code is absolutely free of alignment and endianness problems, could type punning still cause unexpected errors? I saw a Stack Overflow example claiming that, when compiler optimization is in place, under the strict aliasing rule, the compiler might assume two pointers of different types point to different places, and thus decide the writes via the 1st pointer do not affect the 2nd pointer and its pointed-to value, so it generates machine code which may not reload the 2nd pointed-to value in time, and thus lead to catastrophe. However some other people were saying the post was wrong. Do you think this is a legitimate issue?
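A rough sketch of the memcpy approach, wrapped in a helper so the size check lives in one place. `pun_cast` is a made-up name; where C++20 is available, std::bit_cast does the same job and also works in constexpr contexts.

```cpp
#include <cstdint>
#include <cstring>
#include <type_traits>

// Copies the bytes of `from` into a fresh object of type To.
// No pointer of the "wrong" type is ever dereferenced.
template <typename To, typename From>
To pun_cast(const From& from) {
    static_assert(sizeof(To) == sizeof(From), "sizes must match");
    static_assert(std::is_trivially_copyable<To>::value &&
                  std::is_trivially_copyable<From>::value,
                  "both types must be trivially copyable");
    To to;
    std::memcpy(&to, &from, sizeof(To));
    return to;
}
```

For example, `pun_cast<std::uint32_t>(1.0f)` gives the raw bit pattern of the float (0x3F800000 on platforms using IEEE 754 single precision), while `pun_cast<double>(50)` fails to compile on platforms where int and double differ in size.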
@@wusuoliu5431 Well. It depends. gcc and clang have a flag called -fno-strict-aliasing. If you don't use that flag, compiler optimizations might break your code. If you have the flag, *only* hardware-specific things will (e.g. some architectures don't allow unaligned loads).
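The scenario from the question can be sketched like this. `write_both` is a hypothetical function; the point is what the optimizer is allowed to assume, not what any particular compiler actually emits.

```cpp
// Under the strict aliasing rule the compiler may assume that an int* and a
// float* never refer to the same object.
int write_both(int* i, float* f) {
    *i = 1;
    *f = 2.0f;   // assumed not to touch *i
    return *i;   // so this may be compiled as "return 1" without reloading
}
```

Called with two separate objects this is perfectly fine. Called as `write_both(&n, (float*)&n)` it is undefined behaviour, and an optimized build may return 1 even though the memory now holds the bits of 2.0f. That is the "compiler just updated a register" effect, and it is what -fno-strict-aliasing switches off.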
Really informative, thanks!
My favorite instance of this is The fast-inverse square root
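For reference, a sketch of the fast inverse square root written with memcpy instead of the original pointer cast, so it stays within the rules discussed in this thread. It assumes a 32-bit IEEE 754 float; the magic constant is the well-known one, and `fast_inv_sqrt` is just the name chosen here.

```cpp
#include <cstdint>
#include <cstring>

// Approximates 1 / sqrt(x) for positive x.
float fast_inv_sqrt(float x) {
    static_assert(sizeof(float) == sizeof(std::uint32_t), "assumes a 32-bit float");
    std::uint32_t bits;
    std::memcpy(&bits, &x, sizeof(bits));   // float bits -> integer
    bits = 0x5f3759dfu - (bits >> 1);       // initial guess via integer arithmetic
    float y;
    std::memcpy(&y, &bits, sizeof(y));      // integer bits -> float
    return y * (1.5f - 0.5f * x * y * y);   // one Newton-Raphson refinement step
}
```

The result is only an approximation, so compare against a tolerance rather than an exact value.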
This is undefined behavior. It violates the strict aliasing rules. This is only allowed if you are punning to std::byte, char, unsigned char, or a type "similar" to the one you are punning from. Use memcpy instead. It'll get optimized away so don't worry about overhead.
wdym by similar
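Roughly, the allowed cases include looking at any object as raw bytes (char, unsigned char, std::byte) and accessing it through the signed/unsigned counterpart or a const/volatile-qualified version of its own type. A small sketch of both, with made-up helper names `to_hex` and `as_unsigned`:

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Dumps the bytes of any object as hex. Reading through unsigned char* is permitted.
template <typename T>
std::string to_hex(const T& value) {
    const unsigned char* p = reinterpret_cast<const unsigned char*>(&value);
    std::string out;
    char buf[3];
    for (std::size_t i = 0; i < sizeof(T); ++i) {
        std::snprintf(buf, sizeof(buf), "%02x", static_cast<unsigned>(p[i]));
        out += buf;
    }
    return out;
}

// Accessing an int through unsigned int (its unsigned counterpart) is also permitted.
unsigned int as_unsigned(const int& v) {
    return *reinterpret_cast<const unsigned int*>(&v);
}
```

Going from int to float or double is not on that list, which is why the video's examples get flagged.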
This demonstrates the amazing power of knowing C++, also, I guess that we could use this capability to achieve something like Arena Allocation, right?
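As a rough illustration of the arena idea: a fixed byte buffer handed out in aligned slices, with objects constructed in place via placement new. Everything here (`Arena`, `allocate`, `create`, the 1024-byte capacity) is invented for the sketch, and it never runs destructors, so it only suits trivially destructible types.

```cpp
#include <cstddef>
#include <cstdint>
#include <new>
#include <utility>

class Arena {
public:
    // Returns `size` bytes aligned to `align` (a power of two), or nullptr when full.
    void* allocate(std::size_t size, std::size_t align) {
        std::uintptr_t base = reinterpret_cast<std::uintptr_t>(buffer_);
        std::uintptr_t current = base + used_;
        std::uintptr_t aligned =
            (current + align - 1) & ~(static_cast<std::uintptr_t>(align) - 1);
        std::size_t new_used = static_cast<std::size_t>(aligned - base) + size;
        if (new_used > sizeof(buffer_)) {
            return nullptr;
        }
        used_ = new_used;
        return buffer_ + (aligned - base);
    }

    // Constructs a T inside the arena; the raw bytes become a real T object.
    template <typename T, typename... Args>
    T* create(Args&&... args) {
        void* p = allocate(sizeof(T), alignof(T));
        return p ? new (p) T(std::forward<Args>(args)...) : nullptr;
    }

    std::size_t used() const { return used_; }

private:
    alignas(std::max_align_t) unsigned char buffer_[1024];
    std::size_t used_ = 0;
};
```

Placement new is what makes this legal: the storage is plain bytes, but a properly constructed object lives in it afterwards, so no aliasing rule is bent.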
5:20 - the same code for C#:
using System;
class Program
{
    unsafe static void Main()
    {
        int a = 50;
        double value = *(double*)&a;
        Console.WriteLine(value);
        Console.ReadLine();
    }
}
The memory value of "a" is: 32 00 00 00
The memory value of "value" is: 32 00 00 00 00 00 00 00
With the same potential of a crash because you might access memory that doesn't exist.
Actually structs can have some padding: if the types don't align nicely, the compiler can add some padding to improve performance...
Afaik converting structs to arrays isn't always safe, because of alignment.
Love your videos. Could you make one on how to package and distribute a c++ app soon?
int and double are not promised to be address aligned, I think that punning double->int will always work but not the other way around
Elegancko
This is very interesting and a nice demonstration of 'C++ is no magic', but at the same time you should probably make it clearer that this should NOT be used in projects by anyone but experts, and even these should think like 4 times about it. In most cases the possible performance penalty is probably to be preferred over possible insecurities and other problems that this stuff can cause.
There are some situations where I might think of using this, e.g. to convert a 3 element array (interpreted as a 3D vector) to a struct {double x,y,z}, but TBH I lack the experience to try this in anything but a POC.
Edit: So basically the unions in the very next video. Neat, was this video meant mostly as an explanation?
I love your videos. Have been sort of binge watching the C++ series. But on this video you kinda lost me. Looking at this code feels so dirty :p I don't immediately see the practical application in a real codebase. Seems like it could be the source of bugs if you're not careful. Looking forward to the casting videos however!
This is insane, but awesome!
That hand waving...
pro tip
.
.
.
.
.
.
.
.
.
.
Never skip an ad on Cherno's videos
you are the best!
"Memory is by far one of the biggest things we have to deal with when we're actually programming" - Yan Chernikov
I think you will like Typescript. It's amazing
You can do the same thing kind of in JavaScript with Objects. Every Object is really a sparse array so you can just pull stuff out of the object as if it was an array.
After all these years I'm still not sure how I feel about C-style casts versus static_cast, dynamic_cast, const_cast and reinterpret_cast in C++ code. Since the C-style cast is not very clear about which type of cast will occur (it depends on the data types), I feel the more verbose C++-style casts help when I read code that isn't mine. Because, and correct me if I'm wrong, the C-style cast actually tries the various C++-style casts in order, until it finds one that works.
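A compact sketch of what each named cast is for, which is the information a C-style cast hides. All the type and function names here are made up for the example.

```cpp
struct Base {
    virtual ~Base() = default;
};
struct Derived : Base {
    int value = 3;
};

// static_cast: ordinary, well-defined value conversion (here it drops the fraction).
int to_int(double d) {
    return static_cast<int>(d);
}

// dynamic_cast: runtime-checked downcast, yields nullptr when the object is not a Derived.
int derived_value_or(Base* b, int fallback) {
    Derived* d = dynamic_cast<Derived*>(b);
    return d ? d->value : fallback;
}

// reinterpret_cast: reinterpret the address, here to inspect the first byte of an int.
unsigned char first_byte(const int& v) {
    return *reinterpret_cast<const unsigned char*>(&v);
}

// const_cast: removes const. Writing is only valid if the referenced object
// was not declared const in the first place.
void bump(const int& v) {
    ++const_cast<int&>(v);
}
```

A C-style cast such as `(int*)p` could silently mean any of the non-dynamic ones (or a combination with const_cast), which is why the verbose forms are easier to review.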
still i agree w/u
you haven't mentioned that in some cases the compiler might add padding between x and y in the struct.
Now i don't know at the moment what i would do with that but i feel like i can write some troll code accessing everything by casting to a char pointer and moving it around
when it clicked, I was like "WOW Damn that's amazing :)"
Watch out for the type punning police! They really don't like it when you break out these tools. Did you know that some compilers require special options to be deactivated in order to even do these things in c++? Some circles are trying to dumb down the language and put us in a safety box, but I'm totally in support of having such tools. We obviously need to be responsible with them, but they are absolutely necessary to do certain things efficiently. For instance, good luck designing a memory pool without using type punning. Or some type of array/allocator that manually calls constructors as they are needed.
Can the compiler choose how to organize this in memory and break it?
Could you technically use this to modify private class variables you wouldn't have access to normally? I guess I could test it out but it's an interesting thought.
Can't we just write
float value = *(float*)&a;
That way we are accessing the starting point in memory of variable a, and telling the compiler to treat that memory address as a place which holds float (which means next 4 bytes belong to the variable called value, just like with integer a, which means that we are not reading the memory that we didn't want to read).
Also you can use unions for treating same memory as different types
lmao I had no idea you could check out memory like that in the VS debugger, though it makes total sense
everybody in kernel development does this
I love C++ because freedom.
Not sure, but I think you could use a union for the same purpose, and avoid going out of the memory range of a structure :?
I think it's technically undefined behavior or something, but in practice yeah it does work like that on most compilers.
auto pun = "yes"
The messier it gets, the more I like it.
It could/must/would be used to serialize and deserialize objects and stream them to a file or over a connection... Just thinking.🤔
Don't be scared if you don't understand, keep learning C++ and with time you'll comeback and all of this will make sense and be easy.
Hi Cherno,
first of all, your videos are very good but the speed is a little bit too fast.
If we work on a 32-bit OS and we want to save the address of a variable then we need 4 bytes, i.e.
char c;
char *ptr=&c;
so to save the address of char c we need at least 4 bytes, right?
Now my question: as per the C++ standard, the size of an empty class is 1 byte so that objects can be told apart by address.
So how does the compiler fit an empty struct or class into 1 byte of memory?
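The two sizes in the question are about different things: the 4 (or 8) bytes are the size of the pointer variable that stores an address, while the 1 byte is the size of the object being pointed at. A small sketch, with `Empty`, `distinct_addresses` and `pointer_size` as made-up names:

```cpp
#include <cstddef>

struct Empty {};

// The standard only requires a nonzero size; 1 is what compilers typically pick.
static_assert(sizeof(Empty) >= 1, "complete object types have nonzero size");

// Two distinct objects need distinct addresses, which a zero-sized type could not provide.
bool distinct_addresses() {
    Empty a, b;
    return &a != &b;
}

// Size of the pointer variable itself: typically 4 on a 32-bit target, 8 on 64-bit.
// It does not depend on the size of the thing pointed to.
constexpr std::size_t pointer_size = sizeof(char*);
```

So an `Empty` object occupies one byte of storage, and any pointer to it still takes a full pointer's worth of bytes wherever that pointer is stored.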
[04:48] Yup, you got me! I thought you meant pointer to a pointer and was waiting for the second one. LoL
I think we can do all or some of them with reinterpret_cast.
Which one is better?
Watching this makes me hate type punning.
I vaguely knew about its usage, but didn't know its name.
Fantastic avatar
no pun intended
@9:13 wait I'm confused, shouldn't we have de-referenced here? Like *position[0] ? Or does adding [ ] auto de-reference?
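On the subscript question: for a pointer p, `p[i]` is defined as `*(p + i)`, so the brackets already include the dereference and writing `*position[0]` would try to dereference an int. A tiny sketch using a plain int array (function names are made up):

```cpp
// position[0] already dereferences: it means *(position + 0).
int first(const int* position) {
    return position[0];
}

// The explicit form of position[1].
int second(const int* position) {
    return *(position + 1);
}
```

Given `int data[2] = {5, 8};`, `first(data)` and `second(data)` return the two elements in order.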
personally i like using c style conversions as they are simply easier to understand