In 1983, the C programming language did *not* have a void* type. We used char* and at least some of us used a typedef to indicate I really had unspecified stuff (I used 'ambi'). The void and void* types, along with features like allowing different structures to have members of the same name, were added later.
@@joshodom9046 Calling exit() or abort () was possible. You could just ignore return values, they were by default the last statements expression value and the default function type was int, like register. I cannot remember declaring a void function ever catching a bug in practice. The main issue I remember was main() returning random error statuses in unfinished programs rather than success.
@@kidrigger Would you say that Moden c++ design is outdated for post 2011 C++? I saw it was released in 2001. Do you think a modern C++ programmer can still benefit from it?
@@andreashadjiantonis2596 Haven’t read the book but given that constexpr and consteval now exist I suspect that much of those magic templares are no longer needed.
This was one of my favorite talks, and I've seen a lot! He is very informative and at the same time pretty funny. And he is also the first person who successfully tought me that *int is a type*. The brainfk was huge when I finally realized it. xD Best man!
Love it, what I always say: "I'm average smart, that is why I produce simple things, and I get pissed when smart people produce unnecessarily complicated things".
1:12 - you don't need to keep the file and line number, it is enough to keep the return address (__builtin_return_address on gcc), you can then get the source/line number from debug information (addr2line).
This is great! I'm a huge fan of Andrei. He gave a similar talk about allocators in D. I was hoping for a C++ treatment of the same topic. What's with the comments about no profilers on OS X? I don't know why he would think that. Does Instruments not fit the bill?
operator new is a function essentially the same as malloc. No type information. It does not call constructors. You mean to refer to a "new expression" which uses the keyword and has complex behavior implemented by the compiler, which includes calling operator new, checking the result, and calling the constructors.
+John Długosz Isn't that like saying operator+ is a function essentially the same as nop, and an "addition expression" uses the + operator and has complex behaviour implemented by the compiler?
+Nameguy not at all. The addition expression invokes operator+ and that's all, other than knowing precedence. a "new expression" calls operator new, *and* calls the constructor, *and* has a try/catch which calls operator delete. call operator new directly and you'll see it is a small part of what writing 'new C (args)' does. it doesn't even have the same return type. how is that remotely anything like the case of + ?
+Nameguy Please google "Difference between 'new operator' and 'operator new'", there was a nice thread on stackoverflow about the difference between these two things.
Andrei makes it sound like goodSize is useful for wrapping the argument to allocate, but why not just have allocators return the good size (the actually allocated size, and not the requested size) in Blk, which callers can read to find out how much they really got? Is there any other use for goodSize? Any reason to know goodSize without calling allocate?
Upon further investigation, there is a serious drawback to returning the actual allocated size in Blk: it makes using the allocator harder. Consider if I wanted to allocate memory for a single int: int *p = allocator.allocate(sizeof(int)).ptr; allocator.deallocate({ p, sizeof(int) }); By always returning the originally requested size in Blk, you make it so that we no longer have to carry around the information about the size of the allocated block, since we know it statically to be sizeof(int).
It is because of composability. You might have an allocator that introspect various allocators (at runtime) and determines which one is the best depending what caller wants. Compiler can inline also any calls to goodSize, and in some cases optimize the entire determination to simple check or arithmetic. Also returning something else in the Blk that what was asked for can make things hairy in deallocate.
It looks to me like the freelist allocator has the fallback allocator baked in. I would think the whole point of designing for composition is that you can separate those two parts. The simpler free list allocator then becomes: allocate(size) { if (size is wrong or free list is empty) return null; else return head of free list } owns(p) { p.size == mysize } deallocate(p) { assert(p.size == mysize); prepend p onto the free list; } When combined with a fallback allocator, you should get Andrescu's free list allocator.
Finally someone shows the elephant in the room, those deallocators aren't free, people seem to think GC is bad and expensive and they never take care to notice the freaking linked-list heavy infrastructure that's consuming time after their method ran.
@@monad_tcp It seems that when you deallocate you potentially add a pointer to a list which is O(1) and when you allocate and remove something from the free list that is also O(1). The real cost is that the memory may be cold so you loading the data cache to do allocation or deallocation. This is a bad thing but, as pointed out in the talk, this is still a high performance memory strategy if you use in some local setting where the memory used is all hot.
Isn't this because it makes no sense to make a freelist allocator the second part of the composer -- you'd always just use the first part. So by baking the parent in, you enforce it to always be the first part of the composed allocator.
Using the generic fallback for Freelist feels more fragile - especially returning `owns` true for allocations you didn't make, just because you know that the Fallback code will then call you for deallocation. It's more code reuse, but not really more 'composition'. And as @@mmaxio says you couldn't use it any other way, you aren't adding flexibility by separating it out
51:33 this line contains a bug: return b.length == s || parent_.owns(b); In a FallbackAllocator, the FreeList will falsely claim to own a block that really belongs to X, if the size happens to match.
Yeah, I think the ownership model is interesting, but as implemented in the examples, it certainly was not well formalized and could easily lead to improper compositions.
That's because if you try to allocate something with size *s* then it's the FreeList allocator that's going to take precedence and allocate the object. The X allocator will never get the chance of allocating anything of size *s* and therefore never own anything of size *s.*
On slide 29 the `owns` should have && instead of ||, no? The size needs to be the same and it needs to come from the parent allocator. Also, this assumes that no two FreeList for the same size will be created with the same parent allocator.
This is because of first check is most likely cheaper than delegating to parent_.owns() function. So it is an optimization in case the parent_.owns() is expensive. Realistic FreelistAllocator will have a bit more complexity in all these functions actually. Take a look for example at this: github.com/dlang/phobos/blob/master/std/experimental/allocator/building_blocks/free_list.d#L705 , adding support for batch allocation makes it even more interesting.
I had to implement such allocators in C for a "standard" library for embedded systems in my company and man I wish I had C++ templates for that. I had to use macro-based templates instead which I wish on no enemy.
'allocators should only care about size, alignment' ... but the type could give hints as to useage, or lifetime (as defaults, and you could override that with explicit hints)
Allocators seems like the wrong level of abstraction to be concerned with type, a level up from that, e.g. a container that uses an allocator, is where I might care about types. Useage hints sounds like a good idea though, like 'try to allocate this close to where I allocated this other data' or 'make this X byte aliigned', but the type itself seems like a poor metric to make these kinds of decisions.
The things he describes do not integrate well with the current STL implementation. In fact, that is one of the keystone statements. The next iteration of the STL (aka. STL2) is currently being designed which will be a complete revamp of everything. The STL has many historic garbage in it that are rooted too deeply into the entire system. Allocators for one is one such entity. You cannot change its API without having to touch just about everything.
just found the hole in my argumentation: you don't get size_t as an input in free() in an std::allocator, do you? so you'd have to keep track of that seperately. This sucks >.
All those compositions incur extra overhead as all the checking as to which allocator owns a pointer or what allocator to use happen at runtime. This kind of stuff should be done during compile time , but that makes it impossible to allocate arrays whose length is only known at runtime.
These are templates, it will all inline into extremely efficient code actually. The performance will be as good, or actually better, than writing similar allocator by hand. Guaranteed.
the type information of the pointer being (de-)allocated should be enough to accelerate the look up. In case of segregate-allocator for example a binary search using the size of the type can be used as mentioned in the talk.
You'd have to write wrapper code to "unpack" the struct Blk into size_t and void*. Maybe some day std::allocator will be gone and something like this will be added to the standard, but until then, you'd have to write a wrapper. You could also write a std::allocator that does only one thing: pass stuff on to an instance of YourComposableAllocatorType (YCAT). You actually use the YCAT instance for your code, but "dumb" code like std::something can use the std::allocator as a shim.
am here because I wanted to implement my own std containers (fake std), and didn't knew how vector managed to realloc because new/delete operators can't be combined with realloc/calloc/malloc/free operations. Now I know a byte more about allocators-
On the slide on minute 10, I don't understand what is meant by "no communication with the constructor". A call to new with a user-defined type calls the appropriate constructor of that type. Does anybody understand what Andrei wants to convey with this point?
when using the new expression, the constructor call is part of the syntax (the parenthesis) so the allocator itself has no idea of what constructor you want to call after his job
This was great high level info but I think I missed something crucial/obvious. How do I actually allocate the memory without malloc or new? Those functions don't look like they do much to me.
You'll ultimately have to rely on malloc, new, or perhaps some fixed size array (well, if you're staying within the confines of the standard). The point was more about how the allocator interface should have been designed to easily compose and customize their behaviour. But the first building blocks of your allocator will be things like Alexandrescu's Mallocator that merely call malloc/free (or something calling new/delete instead, or perhaps something that provides memory from a fixed-sized automatically allocated std::array, ... c.f. 35:46) In fact, Andrei Alexandrescu gives an example at 1:08:49 : ultimately, A, Bucketizer, or FreeList all rely on Mallocator to do the actual allocation. If you want to allocate memory dynamically without calling malloc or new, you'll certainly be using OS-specific system calls that obviously won't be defined in the C++ standard (e.g. for linux, things like brk, sbrk or even mmap). But that wasn't really the point of the talk (although you could encapsulate your OS-specific allocator as described in that talk).
there are OS-specific APIs for different OSes I believe. I am not very experienced, but I have certainly seen a few syscalls and stuffs, and malloc() does actually do decent amount of things under the hood
VirtualAlloc on windows, mmap on macos/linux. You just ask the operating system for memory however it exposes that functionality! You're running on an operating system after all!
All of them use malloc/new. If you google StackAllocator or the Free List allocator for example you'll notice they all rely on getting a block of memory first from malloc/whatever platform API. For the Stack Allocator example, when he 'deallocates' the memory, you just move a pointer in an already allocated chunk, and all of that chunk still stays allocated. Other processes can't use it still. But it's 'deallocated' in a different sense in regards to the StackAllocator's state.
That's got to be the only good way. His talk left me wondering how an allocators could implement owns() without all the same problems of keeping size information.
The issue there is how would you store the fallback allocator? You should be able to call the templated allocator yourself instead of going via the fallback allocator. As he mentioned, allocator have to store state for good allocation, so owns is the logical choice.
The allocator will have to handle alignment one way or another, taking it in as a type parameter or as a direct and explicit alignment paremeter doesn't change anything in that regard. There are no benefits from making your allocator aware of a type (There are cases where you *dont* care about it, like when allocating a chunk of memory that is going to be used by another allocator)
Is it advisable to have all these template instantiations of the same code that only differs in template parameters that are integers? Won't your code get bloated?
it would definitely result in larger binaries, but the code itself would be faster (depending on the compiler of course). Modern compilers should also optimize away unreachable code paths.
It is a good exercise for a brain, but also it has so many misleading points. First of all, why free function work this way? Because you can't trust user. It can send you correct pointer and wrong size (or vice versa, or both wrong), consequences of this are unpredictable and probably will lead to really strange bugs in allocator. Also free is not taking pointer to the beginning of allocated block, any address within this block is accepted. Without the latter you simply could not deallocate by pointer to superclass of an object (unless it is the first superclass), or does the speaker suggest to use dynamic type lookup to deallocate? Modern malloc do many of the mentioned optimizations. Why would you need to reinvent it in your code? I cannot imagine situation when you really need such an universal allocator. Is it going to be a shared allocator for certain objects? Well, what about multithreading then? Adding mutex there will make this allocator slower than direct call to malloc. Usually what you neeed is allocator of the fixed size. It can be done without these composition tricks which are not free.
The problem with "modern malloc" and many of them implementing various optimizations. You have no idea how it works internally, can be platform, compiler and version dependent. These composition tricks do pay off actually. The library Andrei wrote implements everything that is in jemalloc (high performance allocator with good scalability and low fragmentation), in 1/5 of the lines of code (no joke), and it can be easily tuned or adapted on the fly, for specific parts of code, or for some threads, or whatever.
Actually looking at the what it would take to implement something, and it seems like there is a crucial part missing. Pointer comparisons are not well-formed expressions in C++, unless the pointers come from the same object or array. That some of the owns operations use this comparison does not seem at first look to be a path down cross-platform or compiler use. Perhaps something like the StackAllocator may work frequently, since the stack addresses are more well ordered, but even then it is not guaranteed it seems. For more complex allocators, the owns functions or any comparisons may not work too well. Anybody have any thoughts on this?
Hey Matias, I’m not sure what you mean by “Pointer comparisons are not well-formed expressions in C++”, it is valid in C++ to test if two pointers are equal, this is done regularly for container objects like sets or maps, where you often want the container to hold pointers as their value type. In general, it can be very convenient to use pointers to represent object identity.
@@maryammurphy2599 It is valid in C++ to test if two pointers are equal, but it's not valid to test whether a pointer is bigger or smaller than another pointer **unless** they come from the same allocation. I've been stuck at this as well and would be happy if anyone shed any light onto this and came with suggestion how to implement the allocators that currently rely on this.
@@shroudednight I remember going into the rabbit hole on this a while back and std::less still has some caveats. For example it's just defined to provide *some* total ordering, but on segmented memory architectures, it's not a guarantee this will be what you expect. Also when comparing pointers, the compiler is free to assume they came from the same memory buffer, which with the owns API may not be the case so some care has to be taken to write this in a way that casts the pointers to uintptr_t. Basically, each allocator is nice separately, but I'm still not sure owns is the right semantics to allow for full composition of the types. For example how would a Fallback implement deallocation? If a pointer came from the FreeList, when the time came to delete, how would Segregated implement owns properly to ensure that all pointers get deleted by the allocator they came from? The FreeList should say it owns blocks of size S, but those may have come from Malloc actually in the backing stack for the FreeList expires. Instead it would be better if the allocation had some byte metadata to track if the primary or fallback was used and then segregated should already know which one was used based on the size.
Can anyone answer me a question? I'm not very experienced in c++ and this talk left me a doubt... Say that I implement all of these allocators...however I could have a class that contains an std::list...this will be probably allocated with mallocs, how are these allocator going to help me with that?
+KapoGames In fact, this talk is more about building utility allocators than performant allocators. If you're interested in optimizing some data structures like lists, you could have a look at the memory pool allocator pattern, or the small object allocator from the loki library (loki-lib.sourceforge.net/), the support library for the "Modern C++ design book", which is a great book btw. You also need to know what traits an allocator should match to be used with stl containers, you can have a look at this page : en.cppreference.com/w/cpp/memory/allocator_traits Then you can apply these allocators to the stl containers like std::list, like this for example : std::list myList; The list will no longer use the default allocation strategy, but the allocation strategy of your passed allocator type. Please also note that some containers already are really fast with the default malloc, due to the way they works, like std::vector. As for std::list, I remember implementing a memory pool allocator and improving insertion and deletion performances, because of the elimination of dynamic allocations. Hope that helps !
Thanks! My problem was actually much deeper, I didn't get the idea that an allocator should just provide the memory, and I didn;'t know how to use smart pointers to call the deallocation on the allocators. However in the last days I managed to understand this, refactor all my project, and build 3 different allocators! Now your answer is interesting, because three days ago I would have not undertand it, now it brings me to the next steps :) One more question...I was already thinking of buying modern c++ desing...would you recommend it in 2015, or there is some new book with similar content and newer stuff?
I may not be the best to anwser this question actually, as I'm not a C++ expert either :) But I still find "Modern C++ design" interesting today, because it gives a lot of highly reusable building blocks and technics. In fact, I might be affraid that today's litterature on real modern C++, aka C++14 and beyond, is still quite limited. The reason is that the language is evolving really fast these days, and lot of things are invalidated very quickly, so writing a book about modern C++ features may actually be quite a loss of time until things stabilize a bit (when we'll get everything planned for C++17 I think). But yes, there's ONE thing you might want to read if you want almost not arguably good C++ practices, which is, you might know, the C++ core guidelines : github.com/isocpp/CppCoreGuidelines/blob/master/CppCoreGuidelines.md This is not really meant to be read btw, but trust me, there's so much interesting thing in this, you might want to consider reading some rules each day, and applying them after you understood why they're here. Also, the GSL (guideline support library) provides quite useful (but right now limited) features. Then, I think C++ is really all about reading (like on experts blogs, or extremly detailed answers on stackoverflow) and practice, and some talks here and them is always good, but you really need to practice to see how, IMHO, C++ is such a flawed but beautiful language at the same time !
free without size is a good thing - consider it with size and you have yet another sort of UB when deallocation size is not equal allocation - do you want more UBs in C/C++? don't think so
40:41 stack allocation and the rest of the time you just do it on heap because it will live long enough that deallocating it shouldn't impact the hot path, we have multi-core CPUs, right, then you can use GC for the heap without the big performance hit, right ? Sure a small "%" of one of the tens of cores could be used by the GC, right ? You're almost there, so close to what an actual GC really does when it knows to migrate types to the stack, all of them does it. A freed-list with compaction, that's what a GC does, and it has cache affinity because the tree can be kept separated from the data-structure. And the you introduce generations, besides buckets for allocation sizes. congratulations, You're so close to getting the semantics of a GC.
45:29 yeah it is a powerful pattern, almost like a "garbage collector" and the theory behind them were based on solid computing science foundation and experimental data... This manual memory management thing is funny business. One day it will be regarded the same as coding in assembly was regarded when FORTRAN was invented. I'm writing this here for future people coming from 2035. Yeah, we were right, its impossible to do manual memory management of 512TB of RAM for a single process. Besides I hate having to remember to free stuff, I'm scientist, not an inventory manager clerk.
A GC will allocate on the heap if a certain allocation is to big, which is not what the stack allocator does. The stack allocator will *only* allocate on the heap if you've exhausted all its available memory. With a GC, allocating 496 strings of size 33 will most likely do 496 heap allocations, and since heap allocations are usually aligned on 8 bytes, you'll have 3472 bytes wasted. With a stack allocator, that will be no heap allocations and no bytes wasted.
My opinion is that I don't like this talk. Because he takes way to long to make his point and share his knowledge what he recommends programmers to do and why that is.
Odd, I would think the vast majority of people that watched this tend to disagree with you. I found this a very nice talk, it was interesting, engaging and rather useful. I spend all day everyday watching talks on coding and they are generally just so drab most of the world would lose interest in five minutes. This fellow seems to make things rather palatable.
std::alligator(); //probably useful
It was as if he was dodging a sniper. Agreed. The camera operator nailed it!
This guy is hilarious! :D A joy to watch and learn from.
In 1983, the C programming language did *not* have a void* type. We used char* and at least some of us used a typedef to indicate I really had unspecified stuff (I used 'ambi'). The void and void* types, along with features like allowing different structures to have members of the same name, were added later.
Did you have to return int for functions with no return value?
@@joshodom9046 Yes: en.wikipedia.org/wiki/C_(programming_language)#K&R_C
@@joshodom9046 Calling exit() or abort () was possible. You could just ignore return values, they were by default the last statements expression value and the default function type was int, like register.
I cannot remember declaring a void function ever catching a bug in practice. The main issue I remember was main() returning random error statuses in unfinished programs rather than success.
and that allowed for the language to become an untyped mess, type punning is EVIL
I read two of his books and I had no idea he was hilarious in front of an audience. Frankly I expected the opposite!
Honestly found Modern C++ Design funny. It entertained while dumping a huge amount of knowledge (Honestly still don't grasp everything 100%)
@@kidrigger Would you say that Moden c++ design is outdated for post 2011 C++? I saw it was released in 2001. Do you think a modern C++ programmer can still benefit from it?
@@andreashadjiantonis2596 Haven’t read the book but given that constexpr and consteval now exist I suspect that much of those magic templares are no longer needed.
So, such allocator, as described by Andrei, does it exist? Or it's yet just an idea for future C++?
This was one of my favorite talks, and I've seen a lot! He is very informative and at the same time pretty funny. And he is also the first person who successfully tought me that *int is a type*. The brainfk was huge when I finally realized it. xD
Best man!
What did you think `int` was before?
Cokémon Serious question: what did you think “int” was? It would help me understand what beginners may struggle with.
Thank you very much Dr Andrei Alexandrescu
28:20 "not having to be intelligent is a good sign" in design
Love it, what I always say: "I'm average smart, that is why I produce simple things, and I get pissed when smart people produce unnecessarily complicated things".
1:12 - you don't need to keep the file and line number, it is enough to keep the return address (__builtin_return_address on gcc), you can then get the source/line number from debug information (addr2line).
This is great! I'm a huge fan of Andrei. He gave a similar talk about allocators in D. I was hoping for a C++ treatment of the same topic.
What's with the comments about no profilers on OS X? I don't know why he would think that. Does Instruments not fit the bill?
operator new is a function essentially the same as malloc. No type information. It does not call constructors. You mean to refer to a "new expression" which uses the keyword and has complex behavior implemented by the compiler, which includes calling operator new, checking the result, and calling the constructors.
+John Długosz
Isn't that like saying operator+ is a function essentially the same as nop, and an "addition expression" uses the + operator and has complex behaviour implemented by the compiler?
+Nameguy not at all. The addition expression invokes operator+ and that's all, other than knowing precedence. a "new expression" calls operator new, *and* calls the constructor, *and* has a try/catch which calls operator delete. call operator new directly and you'll see it is a small part of what writing 'new C (args)' does. it doesn't even have the same return type. how is that remotely anything like the case of + ?
+Nameguy Please google "Difference between 'new operator' and 'operator new'", there was a nice thread on stackoverflow about the difference between these two things.
it can be overloaded per type, and the delete can take size. this is useful
How does one use these in any meaningful way, without reinventing the stl? As in, passing these allocators to std::vector won't work.
Andrei makes it sound like goodSize is useful for wrapping the argument to allocate, but why not just have allocators return the good size (the actually allocated size, and not the requested size) in Blk, which callers can read to find out how much they really got? Is there any other use for goodSize? Any reason to know goodSize without calling allocate?
Probably the benefit is that you can find the goodSize at compile-time, rather than run-time. In many cases, your solution would probably work though.
Upon further investigation, there is a serious drawback to returning the actual allocated size in Blk: it makes using the allocator harder. Consider if I wanted to allocate memory for a single int:
int *p = allocator.allocate(sizeof(int)).ptr;
allocator.deallocate({ p, sizeof(int) });
By always returning the originally requested size in Blk, you make it so that we no longer have to carry around the information about the size of the allocated block, since we know it statically to be sizeof(int).
It is because of composability. You might have an allocator that introspect various allocators (at runtime) and determines which one is the best depending what caller wants. Compiler can inline also any calls to goodSize, and in some cases optimize the entire determination to simple check or arithmetic. Also returning something else in the Blk that what was asked for can make things hairy in deallocate.
This talk is mad interesting, yo.
It looks to me like the freelist allocator has the fallback allocator baked in.
I would think the whole point of designing for composition is that you can separate those two parts.
The simpler free list allocator then becomes:
allocate(size) { if (size is wrong or free list is empty) return null; else return head of free list }
owns(p) { p.size == mysize }
deallocate(p) { assert(p.size == mysize); prepend p onto the free list; }
When combined with a fallback allocator, you should get Andrescu's free list allocator.
Finally someone shows the elephant in the room, those deallocators aren't free, people seem to think GC is bad and expensive and they never take care to notice the freaking linked-list heavy infrastructure that's consuming time after their method ran.
@@monad_tcp It seems that when you deallocate you potentially add a pointer to a list which is O(1) and when you allocate and remove something from the free list that is also O(1). The real cost is that the memory may be cold so you loading the data cache to do allocation or deallocation. This is a bad thing but, as pointed out in the talk, this is still a high performance memory strategy if you use in some local setting where the memory used is all hot.
Isn't this because it makes no sense to make a freelist allocator the second part of the composer -- you'd always just use the first part. So by baking the parent in, you enforce it to always be the first part of the composed allocator.
Using the generic fallback for Freelist feels more fragile - especially returning `owns` true for allocations you didn't make, just because you know that the Fallback code will then call you for deallocation. It's more code reuse, but not really more 'composition'. And as @@mmaxio says you couldn't use it any other way, you aren't adding flexibility by separating it out
Oh yeah, what if you do `Fallback`? The Freelist will claim to own B's memory too
51:33 this line contains a bug:
return b.length == s || parent_.owns(b);
In a FallbackAllocator, the FreeList will falsely claim to own a block that really belongs to X, if the size happens to match.
I think it should just be:
return parent_.owns(b);
Yeah, I think the ownership model is interesting, but as implemented in the examples, it certainly was not well formalized and could easily lead to improper compositions.
That's because if you try to allocate something with size *s* then it's the FreeList allocator that's going to take precedence and allocate the object. The X allocator will never get the chance of allocating anything of size *s* and therefore never own anything of size *s.*
On slide 29 the `owns` should have && instead of ||, no? The size needs to be the same and it needs to come from the parent allocator. Also, this assumes that no two FreeList for the same size will be created with the same parent allocator.
Good point! Given that the freelist allocator has a fallback path for size!=s, the b.length==s check should be deleted altogether.
This is because of first check is most likely cheaper than delegating to parent_.owns() function. So it is an optimization in case the parent_.owns() is expensive. Realistic FreelistAllocator will have a bit more complexity in all these functions actually. Take a look for example at this: github.com/dlang/phobos/blob/master/std/experimental/allocator/building_blocks/free_list.d#L705 , adding support for batch allocation makes it even more interesting.
I had to implement such allocators in C for a "standard" library for embedded systems in my company and man I wish I had C++ templates for that. I had to use macro-based templates instead which I wish on no enemy.
Macro templates are simpler and better IMO, what did you find so hard ?
@@anant6778this is a terrible opinion
'allocators should only care about size, alignment' ... but the type could give hints as to useage, or lifetime (as defaults, and you could override that with explicit hints)
Allocators seems like the wrong level of abstraction to be concerned with type, a level up from that, e.g. a container that uses an allocator, is where I might care about types. Useage hints sounds like a good idea though, like 'try to allocate this close to where I allocated this other data' or 'make this X byte aliigned', but the type itself seems like a poor metric to make these kinds of decisions.
The things he describes do not integrate well with the current STL implementation. In fact, that is one of the keystone statements. The next iteration of the STL (aka. STL2) is currently being designed which will be a complete revamp of everything. The STL has many historic garbage in it that are rooted too deeply into the entire system. Allocators for one is one such entity. You cannot change its API without having to touch just about everything.
STL2 idea is not developed anymore.
just found the hole in my argumentation: you don't get size_t as an input in free() in an std::allocator, do you? so you'd have to keep track of that seperately. This sucks >.
allocators: a memory allocator that gives me memory has no state
+Quentin UK I don't understand what you mean by this comment. Can you explain?
+John McConnell "its like the empty bottle that you keep on drinking water from" th-cam.com/video/LIb3L4vKZ7U/w-d-xo.htmlm50s
+Quentin UK ahh cool right right
There are errors in slide 28. It should be struct Node { Node* next; } *root_; And later on root_ = root_->next;
All those compositions incur extra overhead as all the checking as to which allocator owns a pointer or what allocator to use happen at runtime. This kind of stuff should be done during compile time , but that makes it impossible to allocate arrays whose length is only known at runtime.
These are templates, it will all inline into extremely efficient code actually. The performance will be as good, or actually better, than writing similar allocator by hand. Guaranteed.
the type information of the pointer being (de-)allocated should be enough to accelerate the look up. In case of segregate-allocator for example a binary search using the size of the type can be used as mentioned in the talk.
You'd have to write wrapper code to "unpack" the struct Blk into size_t and void*. Maybe some day std::allocator will be gone and something like this will be added to the standard, but until then, you'd have to write a wrapper.
You could also write a std::allocator that does only one thing: pass stuff on to an instance of YourComposableAllocatorType (YCAT). You actually use the YCAT instance for your code, but "dumb" code like std::something can use the std::allocator as a shim.
am here because I wanted to implement my own std containers (fake std), and didn't knew how vector managed to realloc because new/delete operators can't be combined with realloc/calloc/malloc/free operations. Now I know a byte more about allocators-
On the slide on minute 10, I don't understand what is meant by "no communication with the constructor". A call to new with a user-defined type calls the appropriate constructor of that type. Does anybody understand what Andrei wants to convey with this point?
when using the new expression, the constructor call is part of the syntax (the parenthesis) so the allocator itself has no idea of what constructor you want to call after his job
This was great high level info but I think I missed something crucial/obvious. How do I actually allocate the memory without malloc or new? Those functions don't look like they do much to me.
You'll ultimately have to rely on malloc, new, or perhaps some fixed size array (well, if you're staying within the confines of the standard). The point was more about how the allocator interface should have been designed to easily compose and customize their behaviour. But the first building blocks of your allocator will be things like Alexandrescu's Mallocator that merely call malloc/free (or something calling new/delete instead, or perhaps something that provides memory from a fixed-sized automatically allocated std::array, ... c.f. 35:46)
In fact, Andrei Alexandrescu gives an example at 1:08:49 : ultimately, A, Bucketizer, or FreeList all rely on Mallocator to do the actual allocation.
If you want to allocate memory dynamically without calling malloc or new, you'll certainly be using OS-specific system calls that obviously won't be defined in the C++ standard (e.g. for linux, things like brk, sbrk or even mmap). But that wasn't really the point of the talk (although you could encapsulate your OS-specific allocator as described in that talk).
there are OS-specific APIs for different OSes I believe. I am not very experienced, but I have certainly seen a few syscalls and stuffs, and malloc() does actually do decent amount of things under the hood
VirtualAlloc on windows, mmap on macos/linux. You just ask the operating system for memory however it exposes that functionality! You're running on an operating system after all!
All of them use malloc/new.
If you google StackAllocator or the Free List allocator for example you'll notice they all rely on getting a block of memory first from malloc/whatever platform API.
For the Stack Allocator example, when he 'deallocates' the memory, you just move a pointer in an already allocated chunk, and all of that chunk still stays allocated. Other processes can't use it still. But it's 'deallocated' in a different sense in regards to the StackAllocator's state.
30:37 Wouldn't it be easiest to just have the allocator put in the block "which" allocator it came from. The Blk could carry that info too.
That's got to be the only good way. His talk left me wondering how an allocators could implement owns() without all the same problems of keeping size information.
The issue there is how would you store the fallback allocator? You should be able to call the templated allocator yourself instead of going via the fallback allocator.
As he mentioned, allocator have to store state for good allocation, so owns is the logical choice.
This is interesting and all, but how do I actually use these? Without reinventing STL, I mean.
You can but it's probably stuff that should go in the STL, if they do.
Can I avoid manually doing alignment if I take type information in the template argument?
The allocator will have to handle alignment one way or another, taking it in as a type parameter or as a direct and explicit alignment paremeter doesn't change anything in that regard. There are no benefits from making your allocator aware of a type (There are cases where you *dont* care about it, like when allocating a chunk of memory that is going to be used by another allocator)
Doesn't the Freelist allocator violate the strict aliasing rules if you use the returned pointer as something other than a Freelist::Node?
std::allocator(s) is the hell in the paradise of low programming world
It's the hell in the hell of low programming world.
Impressive, great talk!
Is it advisable to have all these template instantiations of the same code that only differs in template parameters that are integers? Won't your code get bloated?
it would definitely result in larger binaries, but the code itself would be faster (depending on the compiler of course). Modern compilers should also optimize away unreachable code paths.
It is a good exercise for a brain, but also it has so many misleading points.
First of all, why free function work this way? Because you can't trust user. It can send you correct pointer and wrong size (or vice versa, or both wrong), consequences of this are unpredictable and probably will lead to really strange bugs in allocator.
Also free is not taking pointer to the beginning of allocated block, any address within this block is accepted. Without the latter you simply could not deallocate by pointer to superclass of an object (unless it is the first superclass), or does the speaker suggest to use dynamic type lookup to deallocate?
Modern malloc do many of the mentioned optimizations. Why would you need to reinvent it in your code?
I cannot imagine situation when you really need such an universal allocator. Is it going to be a shared allocator for certain objects? Well, what about multithreading then? Adding mutex there will make this allocator slower than direct call to malloc.
Usually what you neeed is allocator of the fixed size. It can be done without these composition tricks which are not free.
It wouldn't lead to strange bugs, you'd just call std::terminate since the program has entered a corrupted state.
The problem with "modern malloc" and many of them implementing various optimizations. You have no idea how it works internally, can be platform, compiler and version dependent. These composition tricks do pay off actually. The library Andrei wrote implements everything that is in jemalloc (high performance allocator with good scalability and low fragmentation), in 1/5 of the lines of code (no joke), and it can be easily tuned or adapted on the fly, for specific parts of code, or for some threads, or whatever.
"Its a pain in the peripherals" -great saying I'm probably gonna use that.
Actually looking at the what it would take to implement something, and it seems like there is a crucial part missing. Pointer comparisons are not well-formed expressions in C++, unless the pointers come from the same object or array. That some of the owns operations use this comparison does not seem at first look to be a path down cross-platform or compiler use. Perhaps something like the StackAllocator may work frequently, since the stack addresses are more well ordered, but even then it is not guaranteed it seems. For more complex allocators, the owns functions or any comparisons may not work too well.
Anybody have any thoughts on this?
Hey Matias, I’m not sure what you mean by “Pointer comparisons are not well-formed expressions in C++”, it is valid in C++ to test if two pointers are equal, this is done regularly for container objects like sets or maps, where you often want the container to hold pointers as their value type. In general, it can be very convenient to use pointers to represent object identity.
@@maryammurphy2599 It is valid in C++ to test if two pointers are equal, but it's not valid to test whether a pointer is bigger or smaller than another pointer **unless** they come from the same allocation. I've been stuck at this as well and would be happy if anyone shed any light onto this and came with suggestion how to implement the allocators that currently rely on this.
@@naxaes7889 std::less and company are defined as providing a total order amongst pointers.
@@shroudednight I remember going into the rabbit hole on this a while back and std::less still has some caveats. For example it's just defined to provide *some* total ordering, but on segmented memory architectures, it's not a guarantee this will be what you expect. Also when comparing pointers, the compiler is free to assume they came from the same memory buffer, which with the owns API may not be the case so some care has to be taken to write this in a way that casts the pointers to uintptr_t. Basically, each allocator is nice separately, but I'm still not sure owns is the right semantics to allow for full composition of the types.
For example how would a Fallback implement deallocation? If a pointer came from the FreeList, when the time came to delete, how would Segregated implement owns properly to ensure that all pointers get deleted by the allocator they came from? The FreeList should say it owns blocks of size S, but those may have come from Malloc actually in the backing stack for the FreeList expires. Instead it would be better if the allocation had some byte metadata to track if the primary or fallback was used and then segregated should already know which one was used based on the size.
great talk : )
Good talk; maybe write a book about this.
Can anyone answer me a question? I'm not very experienced in c++ and this talk left me a doubt...
Say that I implement all of these allocators...however I could have a class that contains an std::list...this will be probably allocated with mallocs, how are these allocator going to help me with that?
+KapoGames In fact, this talk is more about building utility allocators than performant allocators. If you're interested in optimizing some data structures like lists, you could have a look at the memory pool allocator pattern, or the small object allocator from the loki library (loki-lib.sourceforge.net/), the support library for the "Modern C++ design book", which is a great book btw. You also need to know what traits an allocator should match to be used with stl containers, you can have a look at this page : en.cppreference.com/w/cpp/memory/allocator_traits
Then you can apply these allocators to the stl containers like std::list, like this for example :
std::list myList;
The list will no longer use the default allocation strategy, but the allocation strategy of your passed allocator type.
Please also note that some containers already are really fast with the default malloc, due to the way they works, like std::vector. As for std::list, I remember implementing a memory pool allocator and improving insertion and deletion performances, because of the elimination of dynamic allocations.
Hope that helps !
Thanks! My problem was actually much deeper, I didn't get the idea that an allocator should just provide the memory, and I didn;'t know how to use smart pointers to call the deallocation on the allocators. However in the last days I managed to understand this, refactor all my project, and build 3 different allocators! Now your answer is interesting, because three days ago I would have not undertand it, now it brings me to the next steps :)
One more question...I was already thinking of buying modern c++ desing...would you recommend it in 2015, or there is some new book with similar content and newer stuff?
I may not be the best to anwser this question actually, as I'm not a C++ expert either :) But I still find "Modern C++ design" interesting today, because it gives a lot of highly reusable building blocks and technics. In fact, I might be affraid that today's litterature on real modern C++, aka C++14 and beyond, is still quite limited. The reason is that the language is evolving really fast these days, and lot of things are invalidated very quickly, so writing a book about modern C++ features may actually be quite a loss of time until things stabilize a bit (when we'll get everything planned for C++17 I think).
But yes, there's ONE thing you might want to read if you want almost not arguably good C++ practices, which is, you might know, the C++ core guidelines :
github.com/isocpp/CppCoreGuidelines/blob/master/CppCoreGuidelines.md
This is not really meant to be read btw, but trust me, there's so much interesting thing in this, you might want to consider reading some rules each day, and applying them after you understood why they're here. Also, the GSL (guideline support library) provides quite useful (but right now limited) features.
Then, I think C++ is really all about reading (like on experts blogs, or extremly detailed answers on stackoverflow) and practice, and some talks here and them is always good, but you really need to practice to see how, IMHO, C++ is such a flawed but beautiful language at the same time !
Perfect, thanks!
My pleasure. If you need some more info, do not hesitate to ask more questions !
Creator of DLang
Insane
Andrei rox!! :)
eu queria e compriende o que voce falar mas nao me ajudou
free without size is a good thing - consider it with size and you have yet another sort of UB when deallocation size is not equal allocation - do you want more UBs in C/C++? don't think so
40:41
stack allocation and the rest of the time you just do it on heap because it will live long enough that deallocating it shouldn't impact the hot path, we have multi-core CPUs, right, then you can use GC for the heap without the big performance hit, right ?
Sure a small "%" of one of the tens of cores could be used by the GC, right ?
You're almost there, so close to what an actual GC really does when it knows to migrate types to the stack, all of them does it.
A freed-list with compaction, that's what a GC does, and it has cache affinity because the tree can be kept separated from the data-structure.
And the you introduce generations, besides buckets for allocation sizes.
congratulations, You're so close to getting the semantics of a GC.
45:29 yeah it is a powerful pattern, almost like a "garbage collector" and the theory behind them were based on solid computing science foundation and experimental data...
This manual memory management thing is funny business. One day it will be regarded the same as coding in assembly was regarded when FORTRAN was invented.
I'm writing this here for future people coming from 2035. Yeah, we were right, its impossible to do manual memory management of 512TB of RAM for a single process.
Besides I hate having to remember to free stuff, I'm scientist, not an inventory manager clerk.
A GC will allocate on the heap if a certain allocation is to big, which is not what the stack allocator does. The stack allocator will *only* allocate on the heap if you've exhausted all its available memory. With a GC, allocating 496 strings of size 33 will most likely do 496 heap allocations, and since heap allocations are usually aligned on 8 bytes, you'll have 3472 bytes wasted. With a stack allocator, that will be no heap allocations and no bytes wasted.
My opinion is that I don't like this talk. Because he takes way to long to make his point and share his knowledge what he recommends programmers to do and why that is.
I watched his other talks and information density was lower than I'd like.
Odd, I would think the vast majority of people that watched this tend to disagree with you. I found this a very nice talk, it was interesting, engaging and rather useful. I spend all day everyday watching talks on coding and they are generally just so drab most of the world would lose interest in five minutes. This fellow seems to make things rather palatable.