Блин чувак, это божественный канал - спасибо за творчество, хороший звук и подачу. Давненько стримы по программированию искал, так тут еще и понятный, родной акцент :)
46:08 "production-ready" allocators store the metadata within the chunks themselves-the size of the chunk, the state of the chunk (in case needed by garbage collection), a pointer to the next chunk, ...etc. I think this kind of list is known as an "intrusive list" or an "embedded list". if you're wondering, you can take a look at GCC's implementation of dynamic allocation of memory. it uses binning where you have bins and each bin has chunks of a certain size (restricted by the bin) and chunks store the metadata using "boundaries", and there are two boundaries, one at the end of the chunk and one at the beginning. it is pretty interesting!
It's better to track the state of blocks in a separate memory from the blocks themselves, especially with lots of blocks or if you add/remove blocks with some frequency.
It depends. If you have a single memory type like tape, SD cards or old HDDs you had it at the start of your chunk. If you have multiple memory types like modern HDD, SSD or on chip architectures you have dedicated memory for blazing fast but tiny meta data and a different type of memory for the big but clunky data itself.
Quick note on fixed sized chunk allocators (or pool allocators), you don't need an additional array to hold the reusable old chunks. You can just hold a pointer to the last freed chunk, and use that "free" memory space to keep a pointer in that chunk to the next one (and so on). They then form a single-linked list containing all re-usable chunks, and you can easily add or remove from the head of the list. Then you just return the head of the list and shorten it by one, or allocate a brand new chunk if the list is empty. All operations should remain O(1), and the only memory cost is 1 pointer for the head of the list.
You say allocate a new chunk if the list is empty but then what happens when you free a block/chunk? It has no way to know what chunk/list it was allocated too because you use that pointer where the data would have been have it been allocated. So then freeing becomes a O(n) problem
@@eklipsed9254Let's say free chunks are represented by a struct like ``` struct free_chunk { free_chunk* next; }; ``` (Though you don't really need a struct, you can use void pointers the whole way through if you like) And there's some data structure representing your heap, which I'll just represent with a pointer ``` free_chunk* next_free; ``` Then, during allocation, you'd unlink the first chunk from the list ``` void* alloc () { // todo: heap init if heap is uninitialized free_chunk* next_chunk = next_free; // if the heap is full, next_free should be NULL If (next_chunk) { next_free = next_chunk->next; } return (void*) next_chunk; } ``` And during deallocation, you'd just link the chunk back into the list. ``` void dealloc (void* chunk) { // Link this chunk at the beginning ((free_chunk*) chunk)->next = next_free; next_free = (free_chunk*) chunk } ``` Note that we blindly overwrite the chunk here. It doesn't matter that the chunk's been overwritten with garbage after it was alloc()'d, you're initializing a new free_chunk in its place independent of any free_chunk it's been before. If you were using void pointers, you would *((void **) chunk) instead of ((free_chunk*) chunk)->next, but I figure using a struct helps conceptually more than a ton of void * casting.
@@eklipsed9254 freeing can still easily be O(1) if you store the chunk information as part of the allocation itself. E.g, the allocation size is sizeof(Chunk) + sizeToAllocate. Then all that free needs to do is: Chunk *chunk = (Chunk*)(ptr - sizeof(Chunk)) and you get back the original chunk.
I wrote my own malloc implementation some time ago. My approach was to make linked lists with a next and "up" pointer, a member specifying whether the memory is free, and the amount of memory "managed" by the node. Memory is obtained by sbrk, mmap or malloc (redundant lol) depending on compile flags, but the default is sbrk(n), which extends the program's data segment by n bytes. A left/right list represents contiguous memory, while up/down are separate calls to sbrk and are therefore not guaranteed to be contiguous. At first, a small block (4KiB IIRC) is allocated for the first row, so the size of the memory being managed is 4096 - sizeof (struct memnode). When a memory request is made, we go up the list from the bottom, left to right, bottom to top, to find the first contiguous chunk of memory that can satisfy the request when the bookkeeping node is factored in. So if a request for 10 bytes is made, then there must be 10 + sizeof (struct memnode) bytes available. If so, then the list is bisected. The first node has its size reduced, and after the last byte of its owned memory, a node is made out of the "difference", then the first node is returned and marked as not free. Memory is obtained by adding memnode size, the bookkeeping node is obtained from a pointer in the free function by subtracting memnode size. If enough memory cannot be found before running into the null pointer, then the list head's up pointer is checked. If it's not null, then the same process continues there and so on until either enough free memory is found or a null pointer is reached in the up list. If all lists are exhausted, then a new pool is made with exactly enough memory to satisfy the request and pushed to the top of the uplist. For freeing, and this is perhaps not the most efficient algorithm, it goes over the whole list starting from the top of the uplist, merges adjacent free nodes, and if head->next == NULL and head->isfree is true, then sbrk is called again to return it to the OS. Since the top of the list is always the most recently allocated, this is safe as long as this malloc is the only malloc that's been used. IIRC the bottom of the list is a static char array rather than something obtained from the heap as a small optimization, so before free calls the deallocator function, it makes sure the node being freed doesn't have the same address as the head of the bottom of the up list. Memory is re-used bottom-up as it becomes available, although as always, fragmentation is an unavoidable issue. Writing my implementation taught me that syscall overhead isn't the only reason not to be loosey goosey with heap allocation. The other issue is that being all over the place with heap allocations will create fragmentation no matter what implementation of malloc you use, resulting in potentially inefficient memory usage. So that's how my implementation works. A circular linked list would have probably worked, although to be completely safe, there'd have to be calls to sbrk(0) and comparisons between pools and their addresses/sizes to sanity check that they really are contiguous and that something else hasn't called sbrk in the interim. As I was writing this, it occurred to me that a next pointer is not even necessary, since the whole list is contiguous. Just add the size of the memory pool to get the location of the next node. I guess that's why they call it a heap, eh? 2n+0 2n+1. Doing this would eliminate the null pointer so the total size of the heap would also have to be tracked. Something like struct memnode { size_t size; uint8_t flags; }; static char *pool; static size_t poolsize; struct memnode * memnode_next(struct memnode *node) { void *ptr = node; size_t limit = poolsize - (ptr - pool) ptr += sizeof *node + node->size; if (ptr - node > limit) return NULL; return ptr; } It'll be interesting to see how someone else does it.
You can actually just use struct memnode { size_t size; }; This is because, you want to allocate memory that is aligned on a word boundary. So if your heap starts at an aligned address, size is always multiple of 2. This also means that the least significant bits of size is unused and hence can be used to store flags. The same optimization is also used in implementations of std::string to track whether SSO is present
@@gnikdroy@BradenBest I just want to comment that as someone who is essentially permanently in the high level abstractions you people are an inspiration.
Nice one! I don't know if this was already mentioned, but the chunk information is usually saved in the beginning of the allocated chunk. In other words, you always allocate the required size + the size of chunk info structure. That's why you always get unique pointer back even if you allocate 0 length chunk. This approach has multiple benefits in term of performance (less redirections) and reduced memory footstep for the heap management.
A few times i was thinking "no don't do it that way, that'll cause a problem later" and then you get to that point and everything you've set up perfectly solves the problem i was thinking about. It's like well told story
dude, knowing this stuff after studying it super hard, and seeing him basically derive everything with a few minutes of thought is crazy. Like how he figured out that the free function should merge the consecutive free slots, without even having encountered an error at all
About that "freed" or "freeed" question, we somehow have this in french. The verb "to create" => "créer", in feminine form (we generally add another e) => "créée". It's uncommonly used but it exists and is grammatically correct. Generally people just write "crée" which confuses everything, but we're used to being confused :)
Awesome!!! The memory manager is an important thing that often requires customization depending on the specifics of the project. You can't trust the compiler everything.
This is a memory pool. I had to implement this sort of thing over a decade ago, where I had to have a hybrid of uniform chunks pools and variable size chunks. Management was a real b!tch, but it was worth the hassle, especially for the relative speed of allocation and the garbage collection at shutdown. Because the code is rather involved, I would never even think to try to do a TH-cam video on it. You're a brave man😁
Isn’t the 0 size basically just padding? Cos Malloc ensures that an allocation is 64 bit aligned. So i would assume that allocating something of zero size is just making a size_t allocation?
i regularly use a memory pool class i made. has dynamic width blocks and chunks, a partition table, automatic defrag, supplies smart pointers, garbage collection, and more. one of the most useful things ive ever made. can use it on every project for fast memory access and no worries about clean up. definitely recommend using one
Well, real malloc is not done like this of course. It uses Linux system calls like sbrk (if you read its description, it's very much like malloc) and on the first call creates a structure of linked free memory blocks which it uses to search for free memory (a great example of malloc implementation is given in the last section of K&R where each block has a Header structure which contains a pointer to the next free block(its header) and the size of this block in bytes)
i think you're correct, you need os calls to get heap memory; first i thought this video would be about that, but a wrapper for different system calls wouldn't have been that interesting (and maybe not that platform independent), so this manages preallocated stack memory like it was a heap memory pool. this way you might get a better idea about how the os itself could manage it in principle though. here the managing structure is separate from the usable memory space like in an indexed filesystem; using linked blocks with header structures has its pros and cons, it's vulnerable against memory corruption since linked lists or trees easily break at their chains, on the other hand the structure itself can be made dynamic in size as well very easily
if you read the linux documentation about the implementation of the heap, it is made with a set of memory arenas, which are linear allocators. You have to realise that sbrk is a syscall, and that doesnt exist on your computer's hardware, it is part of the OS. When you program an OS, you're building it on top of the hardware, you have no OS to build on top of, so technically, the heap doesnt even exist, all you have is access to all of the computer's memory and thats it. The OS reserves a segment of the memory as the heap and then does what he's showcasing which is what we understand as a linear allocator that uses space from the stack. The thing is that the stack / heap concepts only exist when you make user-space software or any kind of software that has to be built on top of an operating system. Otherwise, what kind of black magic would be behind the implementation of sbrk? It cant "just exist", someone had to implemented beforehand so that programmers using an operating system could have an easier life when doing dynamic memory allocations.
All metadata about allocated and free block stuff is kept inside a block in the same heap. You don't need separate memory to track this information. You just alloc block info + size data and put block info data, then return mem + sizeof(block info) pointer back to the app. The free list requires only the head pointer to the first free block, and all free blocks chained in a singly linked list where you store next pointer in a block info.
This video inspired me to get back to C and unmanaged languages in general. A different approach to the problem that, in my opinion, results in simpler code would be to have only one chunk list, and add a "state" variable to Chunk. It's type would be an enumeration, ChunkState with 2 possible values : CHK_FREE and CHK_ALLOCATED. That way, you don't have to perpetually move chunks across lists.
I know I'm late to this video, but can anyone help me understand line 3 in heap_alloc at 15:24? I don't get what happens when you add a char array to a size_t.
An array is a pointer to a memory region, so when you add heap + size, you move the heap pointer by size_t * (size in bytes of the element type of the pointer) Since he wants to work with bytes for his memory segment, he made the memory heap a char array, because chars are 1 byte long, hence incrementing heap by 1 effectively moves the pointer to the next element in the memory segment it points to, and thus, to the next byte. So by adding heap + size, you effectively move the heap pointer by size bytes.
Ah throwback to "Intro to OS" cs mod. We had to compare using bitmaps and linked lists to track metadata for our alloc and free. Also different strategies for searching, like best first, worst first, best after previous, etc. I think linked lists are great for large heaps with big chunks but becomes overkill for small chunks since the bitmap is heap_size/8(chunk_size)
Hey! Initially, you mentioned that you've previously done videos on fixed size chunk allocators. Could you please share the link to that video? It would be really helpful to me.
someone that know programming, why I cant start the for iteration of the main function of the minute 31(aprox) with int i = 1; why I need to start with i = 0; because I saw that the only repetead memory is in the first two iteration with i = 0; and i = 1; after that works properly. Thanks in advance. And thanks you mr Tsoding with you this journey of programming will be more enjoyable, I hope soon could follow your ideas in pforth, I speend some time learning forth and I love it, that combination with python im sure is very cool
Having no fragmented heap is not sus at all: When you deallocate a chunk right after allocating it, but before you allocate the next, the small freed chunk at the end of the chain will be merged with the rest of the heap before allocating the new element. So you reclaim pretty much all freed chunks just in time for the new allocate. Hence no fragmentation. Edit: Ok, you found it out on your own. ;) It was obvious for me from the start. I should be patient and finish watching the video. Cool video. I have been looking to create something similar for a virtual machine, but I cheated and used malloc for the time being.
Shoulda started off using linked list(s) (integrated into your 'heap'). Fixed size arrays of "metadata" are no more than guesstimates. Not everything is a char... Blocks must be allocated aligned to the machine's 'native' 'word' size (to accommodate buffers containing multibyte datatypes.) The user may only want 1 byte now, but the next 'alloc()' will likely ask for a "void **" or a "float"... 'Sign' each block (with a constant, perhaps its own address), and validate while traversing the list(s) to detect the caller's buffer overruns a.s.a.p. You appear to "add more code and arrays" trying to facilitate a bad premise... Yes, theory and O(log n) is wonderful conceptually, but things will take what they take. Adding another layer of processing toward ameliorating a bad start is NOT the solution.
I want to say thank you. I have known about low level programming because of you. I am going to keep watching your video. I have a question. Is there a video about your development settings?
Another good solution instead of the "Heap_Chunk" array would be to litteraly fill the start of allocated chunks with the structure (which would also accept pointer to next chunk, at least), and return to the user the pointer to the rest of the chunk (the end of the structure). Maybe you did this actually, but I'm still at the beginning of the video lol
Beta programmers: Nooo!! Don’t write your own Malloc It’s not worth it!!!! Chad programmers: I must write my own Malloc so that I may understand how and why Malloc works.
Nnnnope. I've been coding in C for over 3 decades but have never had a need for an XOR linked list. I always need random access which precludes XOR lists which require the head for every access.
xor copying is efficient in processors that don't necessarily have a spare register for swapping... but I doubt if that was the case you would want to be using linked lists anyway....
Perhaps a really stupid question that probably boils down to stylistic choices, but for heap_alloc I'm guessing it would be just as efficient in terms of the machine code produced to do if(condition){do stuff} else{return NULL;} as it would be to do if(condition){return NULL;} // do stuff ?
I think, in both cases, the condition gets evaluated. If the function has to return NULL, the first code performs a goto, and the second code doesn't. If the function has not to return NULL, then the first code simply continues his execution, whilst the second code has to perform a "goto" instruction to the "do stuff" code I'm not a machine code professional, but this is how I imagine this code running in machine code... I'm just sharing a thought, I really don't know if I'm right or not.
depends on the compiler's implementation, but most compilers are smart enough to realise that everything after the else could just be pasted afterwards without needing to make an extra jump or anything like that, leading both ways of writing the code to having identical machine code when compiled. It depends on both the situation and the compiler's implementation, as well as optimization levels, thats it. For example, the code that you showed, when compiled using GCC, it generates an extra jump instruction at the end of the if's body in the code where you used the "else" condition, but the other one did not, which would mean that the second code is just slightly faster. But that is if you have all optimizations disabled. If you enable optimizations with "-O", they both generate the exact same instructions BUT, the way that the instructions are grouped is different. In the code where you use the "else", the instructions inside of the else block are grouped under the tag for the block and any code that goes after it will go under its own tag. But in the case where you dont write "else", then the code inside of the else block's body is added under the same tag as the rest of the code that goes after the else condition. In short, using -O gives the exact same instructions, but they are grouped in a different way, which simply means that if you manipulate the program counter to jump to the events after the IF statement in the first case, you will first go through the else block, and in the second case, you will directly go to whatever is after the if, which is literally equivalent in execution.
Pfffffff.......BLOATED ! void* custom_alloc(size_t size) { srand(time(NULL)); // Seed the random number generator every alloc for security purpose return (void*)(rand() % 0xFFFFFFFF); // Return a random pointer } void custom_free(void* ptr) { // For simplicity, do nothing }
I haven't seen that part yet, but normally that's part of a makefile where you pass options to the compiler, like debug settings, defines, etc. All the -w -g -c stuff.
@@адскийдрачила-к9ш мне наоборот так сложнее понимать, ибо я знаю, как произносятся слова в Received Pronunciation и в General American, а здесь либо комбинация, либо неправильное прочтение. Но это, конечно, не в обиду автору видео будет сказано - он просто не учил фонетику
Блин чувак, это божественный канал - спасибо за творчество, хороший звук и подачу. Давненько стримы по программированию искал, так тут еще и понятный, родной акцент :)
Полностью поддерживаю.
Man, I'm not only one who noticed this special letters and sounds in his speach ahah
да мужик красавчик и английский хороший у него
ахахха, да русский ацкент явный, но неплохо говорит
@@mantissaxd
интересно, где он применяет свои знания в C, кем работает
Very interesting; I learned a lot about how memory allocation works today
46:08 "production-ready" allocators store the metadata within the chunks themselves-the size of the chunk, the state of the chunk (in case needed by garbage collection), a pointer to the next chunk, ...etc.
I think this kind of list is known as an "intrusive list" or an "embedded list".
if you're wondering, you can take a look at GCC's implementation of dynamic allocation of memory.
it uses binning where you have bins and each bin has chunks of a certain size (restricted by the bin) and chunks store the metadata using "boundaries", and there are two boundaries, one at the end of the chunk and one at the beginning.
it is pretty interesting!
It's better to track the state of blocks in a separate memory from the blocks themselves, especially with lots of blocks or if you add/remove blocks with some frequency.
It depends. If you have a single memory type like tape, SD cards or old HDDs you had it at the start of your chunk.
If you have multiple memory types like modern HDD, SSD or on chip architectures you have dedicated memory for blazing fast but tiny meta data and a different type of memory for the big but clunky data itself.
You're the best youtuber I've found providing interesting content that's actually useful and helps people learn new things.
bsearch was work correct. The bug was in the return result. You shouldn't have divided the pointer difference by the chunk size.
Yeah, I was just too lazy to troubleshoot that sorry. :D
Quick note on fixed sized chunk allocators (or pool allocators), you don't need an additional array to hold the reusable old chunks. You can just hold a pointer to the last freed chunk,
and use that "free" memory space to keep a pointer in that chunk to the next one (and so on). They then form a single-linked list containing all re-usable chunks, and you can easily add or remove from the head of the list. Then you just return the head of the list and shorten it by one, or allocate a brand new chunk if the list is empty.
All operations should remain O(1), and the only memory cost is 1 pointer for the head of the list.
You say allocate a new chunk if the list is empty but then what happens when you free a block/chunk? It has no way to know what chunk/list it was allocated too because you use that pointer where the data would have been have it been allocated. So then freeing becomes a O(n) problem
@@eklipsed9254Let's say free chunks are represented by a struct like
```
struct free_chunk {
free_chunk* next;
};
```
(Though you don't really need a struct, you can use void pointers the whole way through if you like)
And there's some data structure representing your heap, which I'll just represent with a pointer
```
free_chunk* next_free;
```
Then, during allocation, you'd unlink the first chunk from the list
```
void* alloc () {
// todo: heap init if heap is uninitialized
free_chunk* next_chunk = next_free;
// if the heap is full, next_free should be NULL
If (next_chunk) {
next_free = next_chunk->next;
}
return (void*) next_chunk;
}
```
And during deallocation, you'd just link the chunk back into the list.
```
void dealloc (void* chunk) {
// Link this chunk at the beginning
((free_chunk*) chunk)->next = next_free;
next_free = (free_chunk*) chunk
}
```
Note that we blindly overwrite the chunk here. It doesn't matter that the chunk's been overwritten with garbage after it was alloc()'d, you're initializing a new free_chunk in its place independent of any free_chunk it's been before.
If you were using void pointers, you would *((void **) chunk) instead of ((free_chunk*) chunk)->next, but I figure using a struct helps conceptually more than a ton of void * casting.
@@eklipsed9254 freeing can still easily be O(1) if you store the chunk information as part of the allocation itself. E.g, the allocation size is sizeof(Chunk) + sizeToAllocate.
Then all that free needs to do is: Chunk *chunk = (Chunk*)(ptr - sizeof(Chunk)) and you get back the original chunk.
@@eklipsed9254 en.wikipedia.org/wiki/Free_list
I wrote my own malloc implementation some time ago.
My approach was to make linked lists with a next and "up" pointer, a member specifying whether the memory is free, and the amount of memory "managed" by the node. Memory is obtained by sbrk, mmap or malloc (redundant lol) depending on compile flags, but the default is sbrk(n), which extends the program's data segment by n bytes. A left/right list represents contiguous memory, while up/down are separate calls to sbrk and are therefore not guaranteed to be contiguous. At first, a small block (4KiB IIRC) is allocated for the first row, so the size of the memory being managed is 4096 - sizeof (struct memnode). When a memory request is made, we go up the list from the bottom, left to right, bottom to top, to find the first contiguous chunk of memory that can satisfy the request when the bookkeeping node is factored in. So if a request for 10 bytes is made, then there must be 10 + sizeof (struct memnode) bytes available. If so, then the list is bisected. The first node has its size reduced, and after the last byte of its owned memory, a node is made out of the "difference", then the first node is returned and marked as not free. Memory is obtained by adding memnode size, the bookkeeping node is obtained from a pointer in the free function by subtracting memnode size. If enough memory cannot be found before running into the null pointer, then the list head's up pointer is checked. If it's not null, then the same process continues there and so on until either enough free memory is found or a null pointer is reached in the up list. If all lists are exhausted, then a new pool is made with exactly enough memory to satisfy the request and pushed to the top of the uplist.
For freeing, and this is perhaps not the most efficient algorithm, it goes over the whole list starting from the top of the uplist, merges adjacent free nodes, and if head->next == NULL and head->isfree is true, then sbrk is called again to return it to the OS. Since the top of the list is always the most recently allocated, this is safe as long as this malloc is the only malloc that's been used.
IIRC the bottom of the list is a static char array rather than something obtained from the heap as a small optimization, so before free calls the deallocator function, it makes sure the node being freed doesn't have the same address as the head of the bottom of the up list. Memory is re-used bottom-up as it becomes available, although as always, fragmentation is an unavoidable issue. Writing my implementation taught me that syscall overhead isn't the only reason not to be loosey goosey with heap allocation. The other issue is that being all over the place with heap allocations will create fragmentation no matter what implementation of malloc you use, resulting in potentially inefficient memory usage.
So that's how my implementation works. A circular linked list would have probably worked, although to be completely safe, there'd have to be calls to sbrk(0) and comparisons between pools and their addresses/sizes to sanity check that they really are contiguous and that something else hasn't called sbrk in the interim.
As I was writing this, it occurred to me that a next pointer is not even necessary, since the whole list is contiguous. Just add the size of the memory pool to get the location of the next node. I guess that's why they call it a heap, eh? 2n+0 2n+1. Doing this would eliminate the null pointer so the total size of the heap would also have to be tracked. Something like
struct memnode {
size_t size;
uint8_t flags;
};
static char *pool;
static size_t poolsize;
struct memnode *
memnode_next(struct memnode *node)
{
void *ptr = node;
size_t limit = poolsize - (ptr - pool)
ptr += sizeof *node + node->size;
if (ptr - node > limit)
return NULL;
return ptr;
}
It'll be interesting to see how someone else does it.
You can actually just use
struct memnode { size_t size; };
This is because, you want to allocate memory that is aligned on a word boundary. So if your heap starts at an aligned address, size is always multiple of 2. This also means that the least significant bits of size is unused and hence can be used to store flags.
The same optimization is also used in implementations of std::string to track whether SSO is present
@@gnikdroy@BradenBest I just want to comment that as someone who is essentially permanently in the high level abstractions you people are an inspiration.
Nice one! I don't know if this was already mentioned, but the chunk information is usually saved in the beginning of the allocated chunk. In other words, you always allocate the required size + the size of chunk info structure. That's why you always get unique pointer back even if you allocate 0 length chunk. This approach has multiple benefits in term of performance (less redirections) and reduced memory footstep for the heap management.
This is the only video that has a clickbaity title, but then the content is even more extreme than the title
True.
Is that legal?
A few times i was thinking "no don't do it that way, that'll cause a problem later" and then you get to that point and everything you've set up perfectly solves the problem i was thinking about. It's like well told story
what are the benefits of using sizeof(list->chunks[0]) instead of sizeof(Chunk) at 1:12:05?
Makes it more tolerant to refactoring.
Wrote this as a system design course assignment. Pain in the ass.
dude, knowing this stuff after studying it super hard, and seeing him basically derive everything with a few minutes of thought is crazy. Like how he figured out that the free function should merge the consecutive free slots, without even having encountered an error at all
About that "freed" or "freeed" question, we somehow have this in french.
The verb "to create" => "créer", in feminine form (we generally add another e) => "créée". It's uncommonly used but it exists and is grammatically correct.
Generally people just write "crée" which confuses everything, but we're used to being confused :)
As a french person this tripped me up more than once :D
>tfw your language is so confusing that even its natives don't even know how to write it correctly
Awesome!!! The memory manager is an important thing that often requires customization depending on the specifics of the project. You can't trust the compiler everything.
This is a memory pool. I had to implement this sort of thing over a decade ago, where I had to have a hybrid of uniform chunks pools and variable size chunks. Management was a real b!tch, but it was worth the hassle, especially for the relative speed of allocation and the garbage collection at shutdown.
Because the code is rather involved, I would never even think to try to do a TH-cam video on it. You're a brave man😁
A personal custom made foot gun
These kinds of exercises is why I’m finding that I’m getting better at understanding C and how the computer operates at a fundamental level!!
"helloow" *oh, nice, this will be a relaxed asrm tutorial "*procede to write at lightning fast speed*" oh shit...
Isn’t the 0 size basically just padding? Cos Malloc ensures that an allocation is 64 bit aligned. So i would assume that allocating something of zero size is just making a size_t allocation?
i regularly use a memory pool class i made. has dynamic width blocks and chunks, a partition table, automatic defrag, supplies smart pointers, garbage collection, and more. one of the most useful things ive ever made. can use it on every project for fast memory access and no worries about clean up. definitely recommend using one
Well, real malloc is not done like this of course. It uses Linux system calls like sbrk (if you read its description, it's very much like malloc) and on the first call creates a structure of linked free memory blocks which it uses to search for free memory (a great example of malloc implementation is given in the last section of K&R where each block has a Header structure which contains a pointer to the next free block(its header) and the size of this block in bytes)
i think you're correct, you need os calls to get heap memory; first i thought this video would be about that, but a wrapper for different system calls wouldn't have been that interesting (and maybe not that platform independent), so this manages preallocated stack memory like it was a heap memory pool. this way you might get a better idea about how the os itself could manage it in principle though. here the managing structure is separate from the usable memory space like in an indexed filesystem; using linked blocks with header structures has its pros and cons, it's vulnerable against memory corruption since linked lists or trees easily break at their chains, on the other hand the structure itself can be made dynamic in size as well very easily
Yes the approach is called first fit allocator. It’s been a while a read the book and here I am :D
if you read the linux documentation about the implementation of the heap, it is made with a set of memory arenas, which are linear allocators. You have to realise that sbrk is a syscall, and that doesnt exist on your computer's hardware, it is part of the OS. When you program an OS, you're building it on top of the hardware, you have no OS to build on top of, so technically, the heap doesnt even exist, all you have is access to all of the computer's memory and thats it. The OS reserves a segment of the memory as the heap and then does what he's showcasing which is what we understand as a linear allocator that uses space from the stack. The thing is that the stack / heap concepts only exist when you make user-space software or any kind of software that has to be built on top of an operating system. Otherwise, what kind of black magic would be behind the implementation of sbrk? It cant "just exist", someone had to implemented beforehand so that programmers using an operating system could have an easier life when doing dynamic memory allocations.
@@realCevrait's not stack memory, I think it's static
@@AlFredo-sx2yybut the processors have dedicated registers for the stack, right?
All metadata about allocated and free block stuff is kept inside a block in the same heap. You don't need separate memory to track this information. You just alloc block info + size data and put block info data, then return mem + sizeof(block info) pointer back to the app. The free list requires only the head pointer to the first free block, and all free blocks chained in a singly linked list where you store next pointer in a block info.
640K should be enogh for everyone. video started with a meme.
This video inspired me to get back to C and unmanaged languages in general.
A different approach to the problem that, in my opinion, results in simpler code would be to have only one chunk list, and add a "state" variable to Chunk. It's type would be an enumeration, ChunkState with 2 possible values : CHK_FREE and CHK_ALLOCATED.
That way, you don't have to perpetually move chunks across lists.
I know I'm late to this video, but can anyone help me understand line 3 in heap_alloc at 15:24? I don't get what happens when you add a char array to a size_t.
An array is a pointer to a memory region, so when you add heap + size, you move the heap pointer by size_t * (size in bytes of the element type of the pointer)
Since he wants to work with bytes for his memory segment, he made the memory heap a char array, because chars are 1 byte long, hence incrementing heap by 1 effectively moves the pointer to the next element in the memory segment it points to, and thus, to the next byte.
So by adding heap + size, you effectively move the heap pointer by size bytes.
@@soulysouly7253 Thank you!
that asmr in the beginning 🥴
I've always wanted to learn C, you're my hero
In chunk_list_find, you should replace `(result - list->chunks) / sizeof(list->chunks[0])` just with `result - list->chunks`
Ah throwback to "Intro to OS" cs mod. We had to compare using bitmaps and linked lists to track metadata for our alloc and free. Also different strategies for searching, like best first, worst first, best after previous, etc.
I think linked lists are great for large heaps with big chunks but becomes overkill for small chunks since the bitmap is heap_size/8(chunk_size)
Your malloc should return aligned addresses, and of course the chunk metadata is inline on the heap (which solves your size0 problem)
Hey!
Initially, you mentioned that you've previously done videos on fixed size chunk allocators. Could you please share the link to that video? It would be really helpful to me.
what program do you use for coloring the man pages?
Showing an implementation using free-list allocation would be easier and more practical.
there could be bool value in the struct indicating whether u are using that chunk or not
someone that know programming, why I cant start the for iteration of the main function of the minute 31(aprox) with int i = 1; why I need to start with i = 0; because I saw that the only repetead memory is in the first two iteration with i = 0; and i = 1; after that works properly. Thanks in advance. And thanks you mr Tsoding with you this journey of programming will be more enjoyable, I hope soon could follow your ideas in pforth, I speend some time learning forth and I love it, that combination with python im sure is very cool
Bro uses calender in terminal
12:07 my motivation!!!
Funny that my homework at uni is to implement malloc, calloc, realloc, free , mmap and more and then this vid pooped into my recommended.
*12:30 in fact I'd argue you cant. Any working, complex system evolved from a simple, working system
Having no fragmented heap is not sus at all: When you deallocate a chunk right after allocating it, but before you allocate the next, the small freed chunk at the end of the chain will be merged with the rest of the heap before allocating the new element. So you reclaim pretty much all freed chunks just in time for the new allocate. Hence no fragmentation.
Edit: Ok, you found it out on your own. ;) It was obvious for me from the start. I should be patient and finish watching the video.
Cool video. I have been looking to create something similar for a virtual machine, but I cheated and used malloc for the time being.
By the time I get my code to 1:33:50, I’m getting print out of size 0 and then “Index is 0” 4 times. Why?
0:47 I see what you did there
I love you man. Please never stop being yourself
Why does he have a porn folder of 9.1 gig showing on the screen.Is that supposed to be a practical joke?
A semester's worth of learning in one video. -:) Thank you
How many fingers do you have for typing. lol
dont have time to finish to video but im curious to see how things accessed outside of heap are unreacxhable
Shoulda started off using linked list(s) (integrated into your 'heap'). Fixed size arrays of "metadata" are no more than guesstimates.
Not everything is a char... Blocks must be allocated aligned to the machine's 'native' 'word' size (to accommodate buffers containing multibyte datatypes.)
The user may only want 1 byte now, but the next 'alloc()' will likely ask for a "void **" or a "float"...
'Sign' each block (with a constant, perhaps its own address), and validate while traversing the list(s) to detect the caller's buffer overruns a.s.a.p.
You appear to "add more code and arrays" trying to facilitate a bad premise... Yes, theory and O(log n) is wonderful conceptually, but things will take what they take. Adding another layer of processing toward ameliorating a bad start is NOT the solution.
nice uh... folder. PepeHands
I want to say thank you.
I have known about low level programming because of you. I am going to keep watching your video.
I have a question.
Is there a video about your development settings?
I failed to find this when doing my assignment that required me to do the same task 😢.
Better use mimalloc, jemalloc or TCMalloc. They all basically work the same and coudn't been easily beaten without chosing the same basic principles.
Another good solution instead of the "Heap_Chunk" array would be to litteraly fill the start of allocated chunks with the structure (which would also accept pointer to next chunk, at least), and return to the user the pointer to the rest of the chunk (the end of the structure). Maybe you did this actually, but I'm still at the beginning of the video lol
Beta programmers: Nooo!! Don’t write your own Malloc It’s not worth it!!!!
Chad programmers: I must write my own Malloc so that I may understand how and why Malloc works.
Nnnnope. I've been coding in C for over 3 decades but have never had a need for an XOR linked list. I always need random access which precludes XOR lists which require the head for every access.
Those 9.1Gb 🤣🤣
Metaforando?
Collection is growing. You can check previous videos.😂
enjoying watch your videos, keep on that great.
xor copying is efficient in processors that don't necessarily have a spare register for swapping... but I doubt if that was the case you would want to be using linked lists anyway....
Very good presentation, very informative.
This question was asked in Qualcomm, Not able to do this.
I sucked 😥😥
isn't memory alignment necessary here?
30:00
I’m wayyy too high for this right now.
How did you make it look so big where the code is visible?
bruh the folder name is crazy
p#rn stash
if you werent allowed to use static local variables and global variables how would you change this ?
26 mins into the video and i am LOST!!!!
Is this data really going to heap, or some other space?
The heap array is on stack
Awesome video!!! Thanks :)
Perhaps a really stupid question that probably boils down to stylistic choices, but for heap_alloc I'm guessing it would be just as efficient in terms of the machine code produced to
do
if(condition){do stuff}
else{return NULL;}
as it would be to do
if(condition){return NULL;}
// do stuff
?
I think, in both cases, the condition gets evaluated.
If the function has to return NULL, the first code performs a goto, and the second code doesn't.
If the function has not to return NULL, then the first code simply continues his execution, whilst the second code has to perform a "goto" instruction to the "do stuff" code
I'm not a machine code professional, but this is how I imagine this code running in machine code...
I'm just sharing a thought, I really don't know if I'm right or not.
depends on the compiler's implementation, but most compilers are smart enough to realise that everything after the else could just be pasted afterwards without needing to make an extra jump or anything like that, leading both ways of writing the code to having identical machine code when compiled. It depends on both the situation and the compiler's implementation, as well as optimization levels, thats it.
For example, the code that you showed, when compiled using GCC, it generates an extra jump instruction at the end of the if's body in the code where you used the "else" condition, but the other one did not, which would mean that the second code is just slightly faster. But that is if you have all optimizations disabled.
If you enable optimizations with "-O", they both generate the exact same instructions BUT, the way that the instructions are grouped is different. In the code where you use the "else", the instructions inside of the else block are grouped under the tag for the block and any code that goes after it will go under its own tag. But in the case where you dont write "else", then the code inside of the else block's body is added under the same tag as the rest of the code that goes after the else condition.
In short, using -O gives the exact same instructions, but they are grouped in a different way, which simply means that if you manipulate the program counter to jump to the events after the IF statement in the first case, you will first go through the else block, and in the second case, you will directly go to whatever is after the if, which is literally equivalent in execution.
Could you help me write my own C code? This was helpful for understanding it, but my assignment is slightly different
isn't this what virtual machine like JVM does?
Meowy uwu I'm ill rn and that surprisingly is some great, most importantly long content to watch while I'm dying xd
Pfffffff.......BLOATED !
void* custom_alloc(size_t size) {
srand(time(NULL)); // Seed the random number generator every alloc for security purpose
return (void*)(rand() % 0xFFFFFFFF); // Return a random pointer
}
void custom_free(void* ptr) {
// For simplicity, do nothing
}
What does CFLAGS mean
I haven't seen that part yet, but normally that's part of a makefile where you pass options to the compiler, like debug settings, defines, etc. All the -w -g -c stuff.
1:44:17 okay
Have you tried making a stop and copy garbage collector yet? it's pretty simple
1:01:08
I guess you used binary search but never sorted the list first?
font?
Times Old Greek
кайф для мозга
well done
come to brazil, big fan here
why not just do
#define unused __attribute__((unused))
void func(unused int a)
{
//...body
}
instead of
(void) var;
??
The latter is 1) shorter and 2) supported by all compilers
very cool
Very interesting, thanks for sharing!
Starts at 14:20
Maria
Get a pop filter for the mouth noises please
freed's feed and seed
(sorry)
1:28:40 wtf lmao
Deusdedit
Great
у тебя очень сильный акцент :)
Зато контент очень интересный)
а я думаю, почему он так четко говорит и почему я наконец-то могу более менее слышать что говорят англо-говорящие
@@адскийдрачила-к9ш мне наоборот так сложнее понимать, ибо я знаю, как произносятся слова в Received Pronunciation и в General American, а здесь либо комбинация, либо неправильное прочтение. Но это, конечно, не в обиду автору видео будет сказано - он просто не учил фонетику
tu eh mt pika
I love you