This guy: writes Garbage Collector in C
Me: writes Garbage in C
I'm dying 🤣
how about that hehe
I have a lot of experience with C and padding (especially with bitfields), but your way of teaching is so interesting that you made me stay for the entire video
Your garbage collector doesn't work, i'm still here
Same here
Gc : lost in memory, looking for you 😂😂
selfhating treannies in the comments lmao
@@someoneelse5505 ....how did you know?
wasted 5 years in uni, studying CS (I failed some unrelated classes, that's why it took me that long), and I learned more by watching your videos and similar things on YT!
14:11 you are mostly correct.
Essentially, the compiler is guaranteeing alignment of the datatypes involved.
Most machines have memory alignment requirements for most datatypes.
On x86_64:
Chars cannot have one nybble in one address and the other in another address.
Shorts must be aligned to 2 bytes.
Ints must be aligned to 4 bytes.
etc., etc.
The same applies to (hardware) vector datatypes (look up vectorized instructions for more info).
So you have rightly observed that chars have a smaller alignment requirement than the pointer, and as such can be packed more tightly.
Printing the addresses would've confirmed that.
If that behavior is not desirable (for example, if you'd rather pack data as densely as possible), you can use the pack pragma (#pragma pack) in C/C++.
The downside is more costly reads and writes.
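A minimal sketch of the "printing the addresses" idea from the comment above, using member offsets instead of raw addresses. The struct Foo here and the function name print_foo_layout are made up for illustration and are not the code from the video; the exact numbers printed depend on the compiler and target.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical struct for illustration: two chars followed by a pointer. */
struct Foo {
    char a;
    char b;
    void *ptr;
};

/* The first member of a struct always starts at offset 0. */
static_assert(offsetof(struct Foo, a) == 0, "first member starts at offset 0");

/* Print each type's alignment requirement and where the compiler
   actually placed each member (assumed helper, not from the video). */
void print_foo_layout(void) {
    printf("alignof(char)   = %zu\n", alignof(char));
    printf("alignof(short)  = %zu\n", alignof(short));
    printf("alignof(int)    = %zu\n", alignof(int));
    printf("alignof(void *) = %zu\n", alignof(void *));
    printf("offsetof(a)     = %zu\n", offsetof(struct Foo, a));
    printf("offsetof(b)     = %zu\n", offsetof(struct Foo, b));
    printf("offsetof(ptr)   = %zu\n", offsetof(struct Foo, ptr));
    printf("sizeof(Foo)     = %zu\n", sizeof(struct Foo));
}
```

Calling print_foo_layout() from main should show the two chars sitting next to each other and the pointer starting at a multiple of its own alignment, with padding in between.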
Bonus: the exception you mentioned is often called a "bus fault" (similar to a segmentation fault).
It is an error the CPU catches (or doesn't, depending on the arch); it then typically jumps through a vector to a handler that cleans up.
Linux handles this pretty well, even on the Raspberry Pi.
Machines have no concept of data types. They operate on words. Registers are word-sized. A memory load reads a word from memory into a register. A memory store writes a word from a register to memory.
Data types are defined by the programming language standard and enforced by its compilers.
@@heraldo623 he specified that the compiler guarantees the alignment
Garbage collector: *collects itself*
"I'm trash, so I took myself out." -The garbage collector probably.
(P.S. this is a joke)
@@Mrkenjoe1 are you writing code in C? You're probably doing something silly with your data (I mean variables). If you still have trouble describing what exactly this code is doing and why it's here, or if it's here because "C didn't give me something-something", try writing down what you want this program to do (from the user's POV, not the programmer's) and start again.
Removed
It's like that old Java joke.
If Java had true garbage collection most programs would delete themselves upon execution.
"I am a random person from the internet allowing you to use global variables!"
Nice 😂😂
After watching this I immediately hate this channel for how many videos you have, how long they are, and that I'm not gonna be able to see them all in one weekend.
I debugged the traditional GC of Gofer (a Haskell variant) on DOS because it did not use pointers but a “cell index”, which way too often tainted memory for integer values (with only a few hundred cells available, that hurt very much).
Pretty cool video! BTW: if there is a cycle of pointers among several chunks in the heap (not necessarily a two-cycle), your code will recurse endlessly until it hits the recursion limit and causes a stack overflow. A naive solution would be to set a manual recursion limit on mark_region, but then cyclic chunks would never be deallocated.
No, it won't. The recursion is only entered if the chunk wasn't already marked reachable, so it won't keep going around a cycle.
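A rough sketch of the point made in the reply above, under the assumption that each chunk carries a "reachable" flag that is checked before recursing. The Chunk type, the mark function and the demo helper are hypothetical stand-ins, not the mark_region code from the video.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical chunk: up to two outgoing references plus a mark bit. */
typedef struct Chunk {
    struct Chunk *refs[2];
    bool reachable;
} Chunk;

/* Depth-first mark. The early return on an already marked chunk is what
   keeps the recursion from going around a cycle forever. */
void mark(Chunk *c) {
    if (c == NULL || c->reachable) return;
    c->reachable = true;
    for (size_t i = 0; i < 2; ++i) {
        mark(c->refs[i]);
    }
}

/* Build a 3-cycle a -> b -> c -> a plus a self-referencing chunk d that
   nothing else points to, mark from a, and count the marked chunks. */
size_t count_marked_in_cycle_demo(void) {
    Chunk a = {0}, b = {0}, c = {0}, d = {0};
    a.refs[0] = &b;
    b.refs[0] = &c;
    c.refs[0] = &a;
    d.refs[0] = &d; /* cyclic, but not reachable from the root a */
    mark(&a);
    return (size_t)a.reachable + b.reachable + c.reachable + d.reachable;
}
```

With the flag check in place the demo terminates and marks only a, b and c; the self-referencing chunk d stays unmarked. Recursion depth still grows with the length of the longest pointer chain, so a very long chain could overflow the stack even without cycles.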
You should mark as alive only the heap chunks that can be reached from the stack. That way self-referencing chunks (circular pointers) get collected too. If you take the heap itself as the root of the "reference tree", circular refs will stay alive forever, leading to memory leaks.
I think you don't need the GCC builtin function to get the frame address. Just declare a variable of type uintptr_t on the stack and take a pointer to it with the “&” operator.
Yep, there are many ways to obtain the stack pointer as discussed in stackoverflow.com/questions/20059673/print-out-value-of-stack-pointer which I linked in the description as the reference. :)
@@TsodingDaily How could I have missed this? Sorry! :-)
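For reference, a small sketch of the two approaches mentioned in this thread: taking the address of a local variable versus the GCC/Clang builtin __builtin_frame_address. The function names are invented for illustration, and the address of a local is only an approximation of the stack pointer, since the compiler is free to lay out the frame as it likes.

```c
#include <stdint.h>
#include <stdio.h>

/* Approximate the current stack position with the address of a local.
   The value is only used as a number and never dereferenced later. */
uintptr_t approx_stack_address(void) {
    volatile char marker = 0;
    return (uintptr_t)&marker;
}

/* Print both values side by side (the builtin is a GCC/Clang extension,
   so it is guarded; other compilers only get the portable variant). */
void print_stack_addresses(void) {
    uintptr_t local = 0;
    printf("address of a local: %p\n", (void *)&local);
#if defined(__GNUC__)
    printf("frame address:      %p\n", __builtin_frame_address(0));
#endif
}
```

On typical targets the two printed addresses should be close to each other, which is good enough as a starting point for a conservative stack scan.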
15:06, just bookmarking but good shit. It's pretty interesting
The stack is always aligned to 8 bytes, actually to 16 most of the time.
The reason is that 16-byte alignment is required for faster access by instructions such as movaps, because movups, the unaligned version, is slower and less optimized.
Many functions internally use movaps and other aligned instructions, so the stack is automatically aligned either before the call or on startup (I'm not sure where).
But it always is.
45:55 there's some Terry Davis energy here
This guy would be a great cast for Riddler in the Batman's universe
In what video did you develop your arena allocator? I cannot find any reference to it in the faq or the repo or anything.
I don't remember. I implemented my own arenas many times in different projects throughout the years, sorry.
Although his sessions are not educational per se... But damn they are awesome... I watched this particular one many times.. to figure stuff out for design interviews
I think the struct Foo is 16 bytes instead of 9 because of how GCC lays out structs. It has a special "packed" mechanism for structs which prevents it from padding up to the next 8-byte boundary, but only if you're explicit about it (a GCC extension to the C standard). Maybe clang would've behaved differently. Maybe not. But either way, it's the compiler that does the alignment, not the kernel or libc.
Tried it out: typedef struct { char i; void *ptr; } __attribute__((packed)) FooPacked;
results in 9 bytes instead of 16.
So essentially, as long as you're on x86_64 GCC, you can be 99% certain that the alignment is done for you by GCC unless you explicitly tell it to do otherwise. I don't know about other architectures, so I can't say anything about them.
Actually not really ..
C struct size and field offsets are strictly defined by rules. One rule says that a pointer type should always be aligned to the architecture's word size.
That is well-defined behaviour in the C standard.
@@user-oe4id9eu4v how does that deny anything I said? If GCC implements the C standard correctly, then "how GCC does structs" effectively means the same as "how the C standard defines how structs should behave".
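A quick way to reproduce the experiment from this thread, assuming GCC or Clang (the packed attribute is a compiler extension). The FooPacked typedef is copied from the comment above; the plain Foo typedef and the print_foo_sizes helper are added here for comparison and are only a guess at what the struct in the video looks like.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Same members, once with default layout and once packed. */
typedef struct { char i; void *ptr; } Foo;
typedef struct { char i; void *ptr; } __attribute__((packed)) FooPacked;

/* With packing there is no padding, so the size is just the sum of the
   member sizes (9 bytes where pointers are 8 bytes wide). */
static_assert(sizeof(FooPacked) == sizeof(char) + sizeof(void *),
              "packed struct should contain no padding");

/* Print sizes and the offset of ptr for both variants
   (hypothetical helper name). */
void print_foo_sizes(void) {
    printf("sizeof(Foo)       = %zu, offsetof(ptr) = %zu\n",
           sizeof(Foo), offsetof(Foo, ptr));
    printf("sizeof(FooPacked) = %zu, offsetof(ptr) = %zu\n",
           sizeof(FooPacked), offsetof(FooPacked, ptr));
}
```

In the packed variant the pointer lands at offset 1, so it is misaligned by construction. That is where the extra cost of packed reads and writes comes from on architectures that care about alignment.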
sizeof(size_t) is not guaranteed to equal sizeof(intptr_t), though generally it doesn't matter c:
Cool stuff btw. Subscribed.
With struct #pragma pack applies - but that does not apply to the location of data segments
And what if I have, in the stack frame, some argument passed to some function whose value falls in the range of the heap base address + size?
Didn't get it: why did he use recursion in the mark_reachable function?
I can fill the heap with random data and get an accidental pointer into the heap.
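The concern about values that merely look like pointers can be illustrated with a tiny sketch. A conservative collector can only ask whether a word falls inside the heap's address range; it cannot tell a real pointer from an integer with the same bits. The heap array, HEAP_CAP and looks_like_heap_pointer below are hypothetical names, not the ones from the video.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define HEAP_CAP 64

/* Hypothetical heap: a fixed array of machine words. */
static uintptr_t heap[HEAP_CAP];

/* Conservative check: any word whose value lies inside the heap's
   address range is treated as if it were a pointer into the heap. */
bool looks_like_heap_pointer(uintptr_t word) {
    uintptr_t start = (uintptr_t)heap;
    uintptr_t end = (uintptr_t)(heap + HEAP_CAP);
    return start <= word && word < end;
}
```

An integer that happens to equal the address of heap[3] passes this check just like a real pointer would, so the chunk it "points to" is kept alive. The result is a possible leak rather than a premature free, which is the usual trade-off of conservative collection.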
apparently it's better to have 16 bytes alignment for x64
What is the editor you use in the terminal? great vid by the way! :D
he uses emacs
How can I learn this stuff? can someone give me a simple roadmap or anything?
Learn what stuff
@@khuntasaurus88 Operating systems, compilers, interpreters, garbage collectors, assembly, low level networking, advanced machine learning without libraries and being good at doing all of this...
@koftabalady going through Tsoding videos and pausing to try and implement the solutions yourself before he does may be a good way
@@koftabalady there's a free online book called "Crafting Interpreters" in which you will learn how to make your own programming language in C. It will teach you how to create 4 data structures (rope, hash table with open addressing and a simple string hash function, dynamically allocated array, and implicit linked list). Additionally, you will implement a simple mark-and-sweep garbage collector. It will also introduce you to Pratt parsing, which is the best parsing method for parsing with precedence. The final product will be a garbage-collected interpreted programming language that emits its own bytecode; you can also expand on it by making it much more embeddable in C/C++.
If you wish to learn about low-level networking, you should probably check out UNIX Network Programming, Volume 1: The Sockets Networking API by Stevens, W. Richard (3rd edition). It will teach you about IPv4/IPv6 (Internet Protocol v4/v6), TCP (Transmission Control Protocol), UDP (User Datagram Protocol), SCTP (Stream Control Transmission Protocol), etc. You will be coding this in C as well; it basically teaches you all there is to know about programming with sockets in C, and you will know how these protocols work under the hood.
For operating systems I would recommend the book "Operating Systems: Three Easy Pieces". You should also write a few basic programs in assembly (I recommend the fasm or nasm assemblers) before trying to write your own OS, because you will need to write some assembly. I would also recommend coding along with osdev.org and taking it easy; creating your own OS is considered the hardest thing you can do as a programmer (depending on which features your OS will have, what hardware it will support, etc.).
@@koftabalady try a computer science degree and lots of practice
you wrote a what?? wow, you woke up today and chose violence... :D
Thanks
I FCNG LOVE YOU
Can you implement Ownership like in Rust?
i have a question, how can we install it to JDK?
Me too, but it looks like one of the best languages. Direct work with memory and absolute flexibility: hello, pointers.
👏👏👏👏
Good thumbnails haha
1:26:56 voidf
9:36 really weird
I love the way u sarcastically praise other languages 😂
18:34
awesome 👍
which ide is he using
It looks like vim
@@celdaemon It's vanilla Emacs with his own customization, IMO. It does not feel like any off-the-shelf Emacs variant. Vim can do a lot of his text navigation and redirecting of command output into buffers, but that is stretching vim's abilities a bit. The distinction comes from how the status bar is rendered in Emacs.
the best ide you could ever possibly use in your life
Hey, which graphics editor did you use in this video?
the one he implemented himself xD
@@welanduzfullo8496 LOL
Python's GC is garbage, so will it collect itself?
Not really Garbage Collector but nice video.
Cringed hard at the start of the video.
In short: the guy discovered cache lines and memory alignment.
Congrats, I guess.
And to avoid guessing for so long, he should have simply dumped the memory occupied by the struct to the log to see what moves where ))) Writing this after watching 18 minutes of the video
You're only allowed to use global variables, when you know exactly and precisely why you're NOT allowed to use them and all the reasons against using them and can explain the reasons why in detail. Then you qualify. Until then. You aren't allowed. End of discussion :)
explainnnn
I don't know, global constants seem to be common enough and are not bad practice if immutable.
As an amateur hobbyist C programmer, I only use global variables when I want something to be absolutely accessible by everyone and not have the pain of including it as an argument to every function I make.
Stuff like the heap allocation in the video.
That was really interesting, thanks for sharing.