This might just be the best video covering ECS I've seen. Just this week I began attempting to implement this on the GBA as the data driven nature of it really aligns well with that sort of limited environment
So, are all archetypes then essentially a form of entity? Like all archetypes represent an entity, but not all entities belong to one archetype. I'm still struggling a bit with the concept, and I'm having trouble pinning down why. Setting aside archetypes for a moment: my implementation was as described first in the video, where all entities have memory allocated for every possible field, but in addition to an id my Entity struct has a component bitmask, with a single bit corresponding to each potential component. This enables very fast iteration through only the specific components tied to an entity; I could even define these bitmasks in a similar manner to archetypes, though I'm still wasting empty space, essentially. Is the purpose of archetypes mainly to avoid that redundant space? I'm starting with the bitmask approach since I won't need all the additional memory just yet, but the memory overhead as the number of components and entities grows seems like it could become untenable rather quickly.
Hey! Thanks, glad you enjoyed it! Sorry this response got so long!

1. To answer your first question: I'd say that in an archetype-based ECS, all entities fall into exactly one archetype. You could think of an archetype as just the "pattern" that an entity is, where the pattern describes all the components that an entity possesses.

2. The bitmask approach is definitely viable. How viable probably depends on what type of entities your game will have. The only real downsides I can think of are:

2.a) It "wastes" a bit of memory, which may also impact cache performance. This isn't really a problem until it is a problem.

2.b) In the same vein of cache performance, it's *usually* better to have a "struct of arrays" (SoA) rather than an "array of structs" (AoS). The reason comes up when you need to loop over just one specific component. Let's say you want to increment the number of ticks every entity has been alive. Your cache utilization will be much more efficient if you just loop through an array of ints, "[]int", and less efficient if you have to loop through your array of structs, "[]MyMegaStruct". This is because when data gets read, a whole cache line of adjacent bytes gets pulled from memory along with it. So when you read your first int, the next 8 or so ints will be prefetched for you and will already be in L1.

2.c) The biggest improvement I see from archetypes is one of "pre-culling" (I'm not sure what to call it, so I'll call it "pre-culling" here). In the video I link below, he calls this general concept pulling data "out of band". Basically, because the archetype system automatically pre-filters entities by what components they possess, you don't have to do any internal checks to see if the entity you are working on is one that has the data you need. As an example, let's say you have 1M entities, but only 10 of them have a "Health" component, and you want to check if anything has died this frame.
With the MegaStruct approach, you have to loop over all 1M entities and check whether each one has a Health component. In the archetype approach, you just loop through the archetypes which have a Health component, which lets you easily discard the vast majority of your entities. All of these downsides I would say are more "scalability" problems: if you don't have a lot of things to do work on, you'll probably see very minimal benefit from improving things like cache access rates. On the flip side, if you have a ton of entities with a lot of different archetype patterns and your access of these components isn't efficient, it's kind of a "death by 1000 cuts". I watched a good video yesterday you could check out if you want more intuition about data-oriented design: th-cam.com/video/WwkuAqObplU/w-d-xo.html Hope that helps!
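The "pre-culling" win described above can be sketched in a few lines of Go (all type and function names here are invented for illustration, not taken from the video):

```go
package main

import "fmt"

type Health struct{ HP int }

// An archetype holds densely packed component arrays for every entity
// that shares the same component set.
type Archetype struct {
	name    string
	healths []Health // empty if this archetype's entities have no Health
}

// countDead never does a per-entity "does it have Health?" check:
// archetypes without Health simply contribute zero inner iterations.
func countDead(archetypes []Archetype) int {
	dead := 0
	for _, a := range archetypes {
		for _, h := range a.healths { // dense, cache-friendly loop
			if h.HP <= 0 {
				dead++
			}
		}
	}
	return dead
}

func main() {
	world := []Archetype{
		{name: "rocks"}, // 1M rocks could live here; none are ever scanned
		{name: "monsters", healths: []Health{{HP: 10}, {HP: 0}, {HP: 3}}},
	}
	fmt.Println(countDead(world)) // prints 1
}
```

The "rocks" archetype contributes zero inner-loop iterations, so even a million rocks are never individually checked for a Health component.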
Great video, I was reading several articles and could NOT wrap my head around what the archetype is supposed to be. Your video was very easy to understand and much appreciated!
@@techpriest4787 It's for cache optimization: basically, you only request the data that you will be operating on. This makes sure that only what you need goes into the CPU cache, minimizing the need to access RAM (which is time-consuming compared to cache).
@aghayejalebian7364 Ah. So at this point you would also use pointers instead of an index to avoid accessing the entire vector of entities. It's not final storage but a per-system filter/isolation/window/scope type. I guess now I understand why the word "type" is in "archetype": the new "type" tells the caching to keep things apart. It seems this only works well with system schedulers, because their parameter signatures determine such archetypes and produce the types via metaprogramming. At least it seems that this is how Bevy does it. It also seems that my Prefabs Nodes Systems kernel would do something similar if I grouped the prefabs into their own vector inside the main vector. Such a jagged array (I'm using Rust, which is why I say vector instead of array) should also isolate by type, though it would only do that on a per-prefab basis and not per system execution. I wonder how much difference it would make. Ditching the rest of the vector seems to be a massive gain already.
Nice. I tried writing an ECS in TypeScript a year ago but gave up after being stuck in the weeds debugging type issues. Now I'm redoing the project in Python, and this video has been very helpful as you're extremely concise compared to other resources.
Thanks! Your medium articles are incredibly good and gave me a lot of solutions to many of my ECS problems. Glad to have had you watch and like my video!
My custom ECS in Rust uses sparse arrays instead, and I keep them sorted (by entity index). This means running an ECS system is as fast as it can be (you're literally going over a flat array that stores all data densely). The trade-off is that getting a component by entity index needs a binary search to get to the location. Archetypes incur the cost at adding/deleting components instead of fetching them. It really depends on the use case whether one or the other is better. I really like to just add and delete components a bunch for a dynamic gameplay loop where you can alter state by adding or removing components. For the archetype pattern that's horrible. Yet if you like to fetch components by entity index a lot in your code, the binary search my implementation has to do can become a bottleneck. It's really fun to create an ECS, but it's also full of very benign-seeming nuances that can have a huge impact on hot loops.
Oh, this comment helped make archetypes click better for me, by posing the question of where exactly the cost lies in each approach. I'm working on a C library for the Game Boy Advance (it's not great but I'm learning lol), and it's a delicate balance between memory and processing power, so I'm getting the feeling archetypes might be more trouble than they're worth there.
@@graydhd8688 That sounds like a really fun project! The problem is how big and fast CPU cache lines are, though, and how many of them the CPU has. If the CPU is great at prefetching but you are fetching more components than you have cache lines in an ECS system query, my approach will have to invalidate the cache multiple times for a single entity lookup, because you'd be pulling from multiple memory locations (component storage arrays). In that case archetypes are way better, since they can prefetch exactly as efficiently as the hardware allows. The only way to find out is benchmarking both, but that's way too much work for a single dev wanting to finish a product in a decent time, so maybe not worth it for you. I'd just try the approach I outlined above, and add that I was an idiot for doing binary searches on fetches. You need a binary search to keep the arrays sorted when you insert components, but for fetching you should just have a hashmap with the entity index as the key and the actual location in the component array as the value. No binary search needed that way, though maybe on your hardware that's too much memory. Good luck with your project!
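A minimal Go sketch of the refined design described here: dense arrays kept sorted by entity ID with a binary-search insert, plus a hashmap for O(1) fetches. The Store and Position names are my own, not from either commenter's actual code:

```go
package main

import (
	"fmt"
	"sort"
)

type Position struct{ X, Y float64 }

type Store struct {
	ids   []int       // sorted entity IDs, parallel to data
	data  []Position  // dense component array, ideal for iteration
	index map[int]int // entity ID -> slot in data, for fast fetch
}

func NewStore() *Store { return &Store{index: map[int]int{}} }

// Insert uses binary search only to keep ids/data sorted by entity ID.
func (s *Store) Insert(id int, p Position) {
	i := sort.SearchInts(s.ids, id) // find insertion point
	s.ids = append(s.ids, 0)
	copy(s.ids[i+1:], s.ids[i:])
	s.ids[i] = id
	s.data = append(s.data, Position{})
	copy(s.data[i+1:], s.data[i:])
	s.data[i] = p
	for j := i; j < len(s.ids); j++ { // re-index shifted entries
		s.index[s.ids[j]] = j
	}
}

// Fetch is a single map lookup - no binary search on the hot path.
func (s *Store) Fetch(id int) (Position, bool) {
	if i, ok := s.index[id]; ok {
		return s.data[i], true
	}
	return Position{}, false
}

func main() {
	s := NewStore()
	s.Insert(7, Position{X: 1})
	s.Insert(3, Position{X: 2})
	p, _ := s.Fetch(3)
	fmt.Println(p.X) // prints 2
}
```

Systems still iterate the flat `data` slice directly; the map is only consulted for random per-entity lookups.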
@@stysner4580 The GBA doesn't have anything so luxurious as a cache, unfortunately, lol. My options are 32 KB of internal work RAM for data needing the quickest access and 256 KB of external RAM for the rest (not including the ROM/cartridge). I plan on doing more than one implementation so I can stress-test and compare approaches. I'll have to add up the size of the data used across all components for a single entity and figure out a reasonable range of entities and total components I could manage in both the faster and slower memory.
@@graydhd8688 Ah, I have no idea about the GBA architecture, as I couldn't really place what hardware it uses, but if it's anything like an N64 or similar, you are probably limited by memory speed 99% of the time. That's true for modern desktops as well, but here you don't have the luxury of caching everything for perf because you simply don't have enough RAM, meaning you're likely to end up reading from a disk. It again depends on the game/application, but I'd say that given this limitation you'd be better off being conservative about changing the actual component data, leaving the cost at instantiation rather than at runtime. You probably can't afford to trade some performance to be actively adding and removing components like what I set out to do. I'm targeting modern PCs only.
Very cool! The archetype pattern loosely reminds me of how the V8 JavaScript engine stores prototypes, the internal object schema for all objects in a JavaScript runtime.
Yeah. That's definitely a viable approach. The only downside (besides cache coherency) I would see is that if you need to filter your 20k entities for the 5 that contain a certain set of components, you would still have to loop over all 20k entities to see what components they have. But that might not be a huge issue or it might be mitigatable depending on the app. Thanks for watching!
In that scenario, the 'for' loop you execute must check every single entity in the game, rather than only the entities that contain the health component. If you're building a simple game without lots of entities, then this probably won't impact performance very much. But laying out the memory in the ECS way that I describe gives you pre-filtered arrays that you can loop over, rather than manually filtering at loop execution time. Hope that helps!
In your first little example, it would make more sense to have position but not velocity. I don't see why something would have velocity and no position.
Instead of searching for entities in an archetype, I just update each archetype's list of entities every time a change in components happens, on an event model, which is much faster. So each archetype knows about all of its eligible entities at any given moment.
Yeah, sounds smart. I do the same thing for individual entity lookups: I have one table for id -> archetype, then inside of an archetype I have another table to look up the index.
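The two-table lookup described here might look roughly like this in Go (the names are hypothetical, not from the actual library):

```go
package main

import "fmt"

// Each archetype maps entity ID -> row inside its own packed arrays.
type Archetype struct {
	rows map[int]int // entity ID -> row index inside this archetype
	xs   []float64   // e.g. a packed Position.X component column
}

// The world maps entity ID -> owning archetype (the first table).
type World struct {
	archetypeOf map[int]*Archetype
}

// LookupX resolves one entity's component with two map lookups:
// id -> archetype, then id -> row within that archetype.
func (w *World) LookupX(id int) (float64, bool) {
	a, ok := w.archetypeOf[id]
	if !ok {
		return 0, false
	}
	row, ok := a.rows[id]
	if !ok {
		return 0, false
	}
	return a.xs[row], true
}

func main() {
	moving := &Archetype{rows: map[int]int{42: 0}, xs: []float64{3.5}}
	w := &World{archetypeOf: map[int]*Archetype{42: moving}}
	x, _ := w.LookupX(42)
	fmt.Println(x) // prints 3.5
}
```

Bulk iteration never touches these tables; they exist only for the (comparatively rare) single-entity lookups.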
The := operator in Go is just used to create a variable and assign a value to it. In this case I'm just creating pos and vel variables in a kind of "for each position and velocity in the world" sort of way.
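For example (the Vec2 type and integrate function here are made up for illustration, not the library's actual API):

```go
package main

import "fmt"

type Vec2 struct{ X, Y float64 }

// integrate moves every position by its matching velocity,
// assuming the two slices are parallel (same entity at each index).
func integrate(positions, velocities []Vec2) {
	for i := range positions {
		pos, vel := &positions[i], velocities[i] // := declares pos and vel and assigns in one step
		pos.X += vel.X
		pos.Y += vel.Y
	}
}

func main() {
	positions := []Vec2{{0, 0}, {1, 1}}
	velocities := []Vec2{{1, 0}, {0, 2}}
	integrate(positions, velocities)
	fmt.Println(positions) // prints [{1 0} {1 3}]
}
```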
Yeah, I think Go has great concurrency. Because the data is laid out really nicely, all I have to do is section off blocks of data to be processed concurrently, i.e. [0, 1023], [1024, 2047], etc. I probably wouldn't pass the underlying data around via channels, but I might pass things like slice headers that tell worker threads which section they are responsible for processing.
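That sectioning idea could be sketched like this, with a made-up parallelScale system operating on a dense component array (chunk size and names are arbitrary):

```go
package main

import (
	"fmt"
	"sync"
)

// parallelScale carves the dense array into fixed-size sections
// ([0,1023], [1024,2047], ...) and hands each section's slice header
// to its own goroutine - the data itself is never sent over a channel.
func parallelScale(data []float64, factor float64, chunk int) {
	var wg sync.WaitGroup
	for start := 0; start < len(data); start += chunk {
		end := start + chunk
		if end > len(data) {
			end = len(data)
		}
		wg.Add(1)
		go func(section []float64) { // each worker owns a disjoint section
			defer wg.Done()
			for i := range section {
				section[i] *= factor
			}
		}(data[start:end])
	}
	wg.Wait()
}

func main() {
	data := []float64{1, 2, 3, 4, 5}
	parallelScale(data, 2, 2)
	fmt.Println(data) // prints [2 4 6 8 10]
}
```

Because the sections are disjoint, no locking is needed inside the workers.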
But wait... why? The key point of ECS is memory locality, isn't it? When I have used C++ ECS architectures, entities tend to be structs which are all the same shape and can be put into a single vector and processed sequentially by a system. This gives you the full benefits of CPU caching, making the game faster with less indirection when looking up data. However, the approach here seems to have entities which are irregularly shaped, meaning that you don't get those benefits. Do you do it just for code organisation? Or am I missing something?
I think there is some confusion. An entity is a collection of components (or an ID that points to a collection of components). So ECS libraries need some way to store that collection of components for the entity ID. A cache-friendly way to store components is to put all components of the same "shape" in an array and loop over that. The problem arises when you have entities with *different* sets of components, and you end up in a situation where not all entities *have* all components. So you either bite the bullet and have "holes" in your arrays, or you do some sparse-set magic (like what EnTT does), or you use archetypes (like Flecs does, and what I decided to do). IIRC, Bevy lets you pick at runtime what sort of component storage you use (sparse sets or archetypes). Depending on your data access patterns, different component storages will give you better performance. So I think the "irregular shape" you're mentioning is that of the entity, not the component. The components are still packed tightly, just like the cache prefers. Hope that helps!
Interesting idea. But man that's a lot of overhead just for a few doubles and so many iterations to get to the actual data. I'd just make an array of doubles, keep track of each entity/component start index and length in the array and then for integration send that entire array to the integrator (s += dt*ds). And if there are colliding objects I'd not give each object access to each other object in the world, I'd pre-cluster them so that each one checks a handful but not millions of entities.
Appreciate the comment. I think it feels like a lot of overhead when laid out in the video, but the core of the loop is still what you described (i.e. pass an array to an integrator or system function). The overhead is only needed when fetching that array (which happens once per tick). Incidentally, some of that overhead is still present when looking up one individual entity, but I'm looking into ways to reduce that if I can. For my cases, looking up individual entities is somewhat uncommon, but still - always a way to improve! The challenge with making a single array of doubles and tracking a section of it for each entity is that it doesn't scale well (cache-wise and memory-wise) when the entity has a lot of components. It also causes a decoding problem of "How do I decode this chunk of floats/bytes into my entity struct?" - or, if you pack a bunch of structs into a big array, you'll just waste space when entities have blank data. Obviously, both could be reasonable approaches depending on your game/data layout. I don't think I had collision code in this particular video, but yeah, it's more optimal to pre-cluster your colliders via either a spatial hash or KD-trees. Hoping to look into that in the future! Thanks for watching!
@@UnitOfTimeYT Thanks for the reply. Well I guess it depends on how much data you actually have and what kind of different entities are needed, as you said.
Just wondering, why can't you just put all like components in their own arrays? Each component would just need a reference to affect its entity, and this would work fine right? This sounds simpler than making archetypes, but I'm curious why I'm probably wrong.
I might not understand your exact question, so let me know if my response doesn't answer it. You can think of one entity as being a column created by pulling a component out of every array at the same index. So if we pull index 5 out of every component array, that gives us the full entity at index 5. The main problem that led me to archetypes is: "What do you do when you have entities that don't have a specific component?" If I have one big array for all the health components, but then index 5 is a rock with no health, I still have to put something in the index-5 health slot. There are a couple of ideas:

1. Make it a health pointer, then just null it when it's not there. This means all healths need to be pointers, which means a lot of extra dereferencing, which could make us slower.

2. Add a bool to indicate that the entity doesn't have health. This is similar to the health pointer, but wastes some space every time an entity doesn't have a health component.

The biggest problem with ideas 1 and 2: say we have a system that draws healthbars, with 1M entities of which only 1k actually have the health component. With one big array, we have to loop over all 1M entities, first checking whether they have the health component, and only then drawing the healthbar. There are two solutions I know of to the "one big array" problem:

3. Archetypes, like I discussed.

4. Sparse sets, where you still have one big array for all components, but you do some mapping magic to compress a sparse array into a dense array. This lets you just delete all of the holes.

Both archetypes and sparse sets come with their own tradeoffs. I personally preferred archetypes because they made more sense to me. Just as a little side note, there are two reasons you might structure your game this way:

1. Organizationally: as a gamedev it's nice to dynamically add and remove components from your entities, and then run different systems on them based on those components.

2. You have so many entities that organizing the memory becomes important.

Performance may not matter to you, and in that case you can definitely implement a simpler ECS pattern if you want. Also, organizationally, you might not need that much configurability (in fact, you may not even want it). So there are definitely reasons for and against using an ECS in general. Hope that helps!
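For the curious, option 4 (sparse sets) can be sketched in a few lines of Go. This uses a map for the sparse side for brevity, whereas real implementations like EnTT use a paged array; all names are invented for the example:

```go
package main

import "fmt"

// SparseSet stores Health only for entities that actually have it:
// "sparse" maps entity ID to a slot in the dense arrays, so iteration
// touches no holes and needs no per-entity "has it?" checks.
type SparseSet struct {
	sparse  map[int]int // entity ID -> index into dense/healths
	dense   []int       // entity IDs, packed
	healths []int       // component data, packed parallel to dense
}

func NewSparseSet() *SparseSet { return &SparseSet{sparse: map[int]int{}} }

func (s *SparseSet) Add(id, health int) {
	s.sparse[id] = len(s.dense)
	s.dense = append(s.dense, id)
	s.healths = append(s.healths, health)
}

// Remove swap-deletes: the last element fills the hole, keeping the
// arrays dense at the cost of reordering.
func (s *SparseSet) Remove(id int) {
	i, ok := s.sparse[id]
	if !ok {
		return
	}
	last := len(s.dense) - 1
	s.dense[i], s.healths[i] = s.dense[last], s.healths[last]
	s.sparse[s.dense[i]] = i
	s.dense, s.healths = s.dense[:last], s.healths[:last]
	delete(s.sparse, id)
}

func main() {
	s := NewSparseSet()
	s.Add(100, 10) // only entities with Health ever enter the arrays,
	s.Add(200, 0)  // even if the world holds 1M entities total
	dead := 0
	for i := range s.dense {
		if s.healths[i] <= 0 {
			dead++
		}
	}
	fmt.Println(dead) // prints 1
}
```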
@@UnitOfTimeYT Thanks for taking the time to explain, I appreciate it. I just realised that I was asking why we don't use the slow ECS, which you answered in the video already 0_0. I guess what didn't click at first is why the 'slow ECS' is slow, and I kinda understand now.
@@Shack263 All good! Happy to answer :) Yeah IMO at this level of performance optimization it becomes really situational on which memory organization will become faster, and it's not super obvious why until you really dig into it.
Hey, thanks! For some of them (specifically the bubbles floating around) I pointed my OpenGL framebuffer at ffmpeg and encoded it to a video. For the graphs, I used the very popular and full-featured library manim. For block diagrams I usually use draw.io. Then I render code blocks to images using a tool called marp. Now that I think of it, I'm using a lot of different things lol
Sorry, I thought this comment was on my Rust vs Go video. For this one it's mostly draw.io for diagrams and then marp for generating slides! Got the idea from NoBoilerplate, who does slide-based video presentations.
Very cool system; I was playing with it as I have recently just started using Go. However, I have a question... If you were to create a system that, say, relies on more than 2 components, would you have to write a new implementation of query2 for each query you would need? Or would code generation come in handy here? Sorry if this is a dumb question, I'm not really an expert Go developer. Thanks for the video by the way. So cool to watch.
Hey, thanks. Yeah, unfortunately Go doesn't support variadic generic arguments, and it also doesn't support overloading function names. So if I want to iterate over 9 different components in my ECS, I have to write a query9 function (for example). This seems to be how the Go authors do it too; I've seen them write families of functions like Map2(), Map3(), etc. Luckily, I haven't had to iterate over more than 3 so far. I also think that if your components are so sparse that you have to iterate over a lot of different arrays at the same time, then you might need to reorganize. But what you pointed out is definitely a very important callout and a limitation of my ECS. Thanks for pointing that out!
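A hypothetical Query2 shows why each arity needs its own function in Go: the type parameters are fixed per function, and there is no way to make them variadic (this is a sketch, not the library's actual API; a Query3 would repeat the pattern with a third slice):

```go
package main

import "fmt"

// Query2 runs f over two parallel component slices, assuming index i
// in both slices belongs to the same entity.
func Query2[A, B any](as []A, bs []B, f func(a *A, b *B)) {
	for i := range as {
		f(&as[i], &bs[i])
	}
}

type Position struct{ X float64 }
type Velocity struct{ X float64 }

func main() {
	ps := []Position{{0}, {10}}
	vs := []Velocity{{1}, {2}}
	// Type parameters A=Position, B=Velocity are inferred from the arguments.
	Query2(ps, vs, func(p *Position, v *Velocity) { p.X += v.X })
	fmt.Println(ps) // prints [{1} {12}]
}
```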
Cache coherence might not be the correct term to use here, as it refers to keeping multiple caches that reference the same memory space consistent. I think it would have been better to use cache locality or cache miss rate for this example. Either way, the intention is understood from the explanation and visuals.
Archetypes should not be intersecting sets. The set "position, velocity, sprite" already contains the set "position, velocity". You can read the former without accessing the sprite as if it was the latter set. This prevents you from keeping track of and updating more sets than you actually need. The sets "position, velocity" and "position, sprite" are not intersecting and would be the kinds of sets you actually want to keep track of in this way. Other than that, I really liked the video.
Yeah that could certainly be a way to reduce the number of archetypes you have. Practically, I'm not sure how you'd heuristically decide in a simple way which archetypes fall into others without ending up with one mega archetype that contains every possible component. In practice though, I do typically have very similar archetypes that are like "all of these components, but also this one" so there is maybe some potential for optimization there.
Hey, I'm currently working on implementing an ECS in rust. I decided to use a sparse-set based system instead of archetypes, but my question is around how you managed to multithread your ECS? Are there any resources you can point me towards, as hours of googling have given me really unspecific results that haven't been any help. Thanks.
Hey, yeah, I think there are two main ways you might multithread: 1. You could multithread different systems. You can think of your set of systems as a Directed Acyclic Graph (DAG), figure out the dependencies between each system, and then run two systems at the same time if they don't depend on each other and don't use the same underlying storage. 2. Inside of each system, if you have a large enough dataset, you can parallelize the execution of your function over the underlying dataset. I'd guess that this is much easier for archetype-based ECSes, mostly because in archetype ECSes you can just divide up indexes, since each component array has the same order (with respect to entity). My understanding is that for sparse sets, different arrays can have different entity orders, which makes it hard to divide up what work needs to be accomplished. If you check the description of this video, there are links for two famous ECS authors; you might check out their articles. They probably have more information about this (I don't have specific references though, sorry).
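A toy sketch of approach 1, with invented system names: "physics" and "ai" have no dependency between them, so they run concurrently, while "render" waits at a barrier for both (a real scheduler would derive this ordering from the dependency graph):

```go
package main

import (
	"fmt"
	"sync"
)

// runFrame executes one frame's systems: two independent systems in
// parallel, then a dependent system after a sync point.
func runFrame() []string {
	var barrier sync.WaitGroup
	results := make(chan string, 2)
	for _, name := range []string{"physics", "ai"} {
		barrier.Add(1)
		go func(n string) { // independent systems: safe to run in parallel
			defer barrier.Done()
			results <- n
		}(name)
	}
	barrier.Wait() // sync point: render must observe finished physics + ai
	close(results)
	order := []string{}
	for r := range results {
		order = append(order, r)
	}
	return append(order, "render") // render always runs after the barrier
}

func main() {
	fmt.Println(runFrame()) // "render" is always last; the first two may swap
}
```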
@@UnitOfTimeYT Hey, thanks for the reply. I did some research on implementing this in Rust; it would certainly be difficult and would be macro-heavy or have clunky syntax, beyond actually developing a concrete multithreading implementation. It leads me to ask: is multithreading actually an upside? For reference, the game I developed this ECS for is a rogue-lite. Adopting a multithreaded system will mean that very few things can be truly parallel, as there must be a sequential order of events in terms of systems (physics comes before rendering), and because of the nature of a rogue-like bullet hell, many systems can affect parts of other systems. This makes the overhead of a multithreaded system questionable, since spawning threads is not cheap, and figuring out the dependencies of systems isn't either. In addition, for things that affect the world, you'd need to delay them to some sort of sync point. You can't have two systems concurrently spawn an entity with the position component unless everything is behind RW (Read-Write) locks, which also makes the performance of such a system questionable. What have you found in your game? With all this in mind, I developed my ECS for developer ergonomics, with performance as a convenient upside. I'm not sure multithreading would be good for either.
Yeah, I think you're rightfully calling out a lot of the difficult challenges with multithreading. You might take a look at how Bevy does it as well; I believe they use both of the approaches I mentioned in my previous comment. For approach 1, you'll really only get gains if your game has separate systems that don't really interact; for approach 2, I think you're more likely to get gains if you have large sets of data that don't interact. In my own ECS library, I currently don't have much in the way of multithreading, though I've intentionally left room for myself, mostly in approach 2. From my game systems, I've observed that (other than the main draw calls, which aren't really multithreadable) there are a few systems that take up the majority of every frame's time: collision detection and batching dynamic geometry. I think optimizing the systems which take the most time is probably your best bet for making your game run faster. Yeah, you'll definitely need sync points to add new entities back to your underlying storage. Bevy has "Commands" that they buffer inside a system and then execute at the conclusion of the system. I'm planning to eventually do something like that, but haven't really finalized it past a basic prototype. I don't think the overhead of locking here will cause too much trouble, but I don't have any data to confirm that, so I guess I'll find out as I do more profiling. Right now my game isn't really big enough to warrant much multithreading, and I get most of my optimizations by just improving the rendering code. Most of my concurrency surrounds the IO part of my game: i.e. pulling in packets and decoding them is done concurrently. And most of my scalability problems (with things like rendering and physics) I've been able to solve by just using better algorithms. Hope it helps!
@@UnitOfTimeYT Yes, this has been really productive. I think the correct approach for me is to develop some sort of task/job multithreading model and have systems run sequentially with access to some sort of job resource, which enables me to multithread on a case-by-case basis. For example, if I implement a procedural generation algorithm with potential for multithreading, I could do it specifically in that system without an overarching approach, and everything else would run sequentially. I also agree with your sentiment that my game is not big enough to warrant multithreading; the game I'm setting out to create will be 2.5D, and performance will almost certainly not be an issue. I have also looked at Bevy's, Specs's, and Legion's source code to varying degrees and have read about their multithreading implementations. The conclusion I've come to is that it is way too complicated for just me to tackle. After doing so much research, I'm just not sure multithreading for this specific project is a net gain. The benefits will be limited, and, if needed, I could multithread on a case-by-case basis. Much of my development has been driven by EnTT and the article you linked by its creator, and after looking at EnTT, it seems they adopt this sort of approach as well.
Careful with "add" and "remove": in game dev there should be no adds or removes, if by that you imply memory reallocation. Have fixed-size arrays and do not reallocate, ever. Predetermine how big the arrays must be, limit object instantiations, keep track of each array's actual size (an index pointing to the "end" within the big array), and simply juggle data around within it.
Yeah I'd generally agree that it's good to reduce and remove memory allocations. In my ECS there's an initial growth period as arrays get sized up to hold more entities, but at some point the array gets large enough for the application and entities are essentially "pooled" in terms of their memory.
Eh, I disagree. In my framework, for both CPU/RAM and GPU/VRAM, I allocate in chunks. That means you don't have to pre-allocate all the data you might ever need, and if you only reallocate when you go 2 chunk sizes above or below the current size, you also avoid the problem where removing a little and adding a little each frame reallocates every time. Using "never reallocate, ever" as a mantra just makes your life unnecessarily hard. If it turns out to be a problem later, you can always optimize it then. It seems to me the "never reallocate" rule is more a dangling pointer/reference concern than a performance one.
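The grow/shrink hysteresis described here might look like this (the chunk size and the one-chunk slack kept on shrink are arbitrary choices for the sketch):

```go
package main

import "fmt"

const chunk = 1024

// capacityFor decides the new capacity for n live elements given the
// current capacity. Growth snaps up to the next chunk boundary; shrinking
// only happens once usage falls at least 2 chunks below capacity, so
// small add/remove churn near a boundary never triggers a reallocation.
func capacityFor(n, current int) int {
	if n > current {
		// grow: round n up to a whole number of chunks
		return ((n + chunk - 1) / chunk) * chunk
	}
	if current-n >= 2*chunk {
		// shrink, but keep one slack chunk so tiny adds don't regrow
		return ((n+chunk-1)/chunk)*chunk + chunk
	}
	return current // inside the hysteresis band: leave capacity alone
}

func main() {
	fmt.Println(capacityFor(1500, 1024)) // grows to 2048
	fmt.Println(capacityFor(1400, 2048)) // stays at 2048 (within the band)
	fmt.Println(capacityFor(100, 4096))  // shrinks to 2048 (1024 + slack chunk)
}
```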
@@stysner4580 Your game's resource requirements are non-deterministic then. It's better to allocate 8 GB or so once, on startup or map load, and be done with it, instead of having hiccups every now and then just because you passed a threshold and need to allocate new chunks or free some chunks...
@@tanko.reactions176 On modern hardware, allocating another couple of megabytes isn't a huge issue at all (of course, not every frame). If you just freewheel it and run into hiccups, you can always implement some very simple object pooling and do your allocations there. Most games allocate a bunch of stuff at startup and stream the rest of their assets, but in that case you should have object pools or similar. Allocating all the memory you might ever need at application startup is a bit too old-school for my liking. Memory arenas with chunks give you flexibility and performance. If you have different settings that might impact how much memory the app needs, you'd have to have some way of calculating the memory capacity needed for every combination of settings. It just seems like a hassle for no good reason.
Hi, just a hint from a non-native English speaker: you speak way too fast, and you seem to have removed the pauses between sentences. It's therefore hard to follow your explanations. I need to rewind every now and then, which doesn't always lead to better understanding. Which is a pity, since you seem to know what you're talking about.
I am not sure I get the point. It sounds like this is just trying to fix problems that are otherwise fixed by just writing code properly. Can you outline exactly why you would want to structure your code like this please?
Sure! This sort of technique is mostly used in game development, where you have relatively tight performance requirements and often need very flexible object creation (where you can dynamically add and remove components at will). This way of organizing your data, as a struct of arrays of components, will get you better cache performance than organizing your data in the opposite direction (i.e. as an array of structs of your components). Definitely not a fit for every software project. Hope that helps.
@@UnitOfTimeYT It does, thanks. But aside from the performance benefits, which I think you could mostly get by not mangling your domain objects with a ton of needless inheritance and then using the appropriate data structures... why is it worth essentially denaturing your codebase? It feels like a really horrible way to organise your code from a DX point of view. What's it like to work with?
@@sacredgeometry You're stuck in OOP land. You never get the diamond-pattern issue with ECS. You need to really shift your reasoning about data structures and architecture to understand why ECS has benefits. Refactoring in ECS is incredibly convenient: you can just define a new component and write a system that takes a certain set of components. It's completely separate and can be deleted to get back to where you were, without any refactoring of existing classes or reimplementing templates/interfaces. Performance is always the first thing mentioned with ECS because it's a DOD pattern very popular in game development, but even for game development the greatest benefit in my eyes is the architecture of the codebase itself. You can mix and match to your heart's content; you can create any entity type you want, and all your data and code is separated and modular.
@@stysner4580 Performance isn't a consideration until it's a problem. Being able to rationalise your codebase is, in most applications, far more important. And I'm not stuck in OOP land; I'm stuck in "your problem is one of your own making" land, as all the problems they are talking about have nothing to do with OOP and everything to do with how they have structured their code. Not only that, but for the most part those problems can still exist in ECS, as well as in literally any paradigm. It's literally just a problem of inheritance vs composition, and plenty of functional approaches have extensive and inefficient type hierarchies that would not only suffer this problem but make it worse because of the preference for immutability.
@@sacredgeometry I can tell from your performance statement that you've never made a game. "Premature optimization is the root of all evil" is such a stupid quote; if it takes 25% more time to implement something you know will have to be optimized later anyway, you should just do it right away. Also, saying "OOP is not a problem if you structure your code right" is really, really weird when ECS is literally another way of structuring your code. Your second point is also pretty moot, since I created a custom ECS in Rust which works just fine, even though Rust would be the worst language to use if your statement were correct. It really doesn't matter what language you create an ECS in; as long as a language has some kind of type erasure and an interior mutability pattern, you can implement an ECS just fine. I think you've never written a sizable project in a non-OOP language, correct? Because this is what you see for any functional or non-OOP language: people say there are so many problems with them because they're more complicated at the start of a project. But the real scalability powers of composition over inheritance and very strict type systems only show when your project reaches a certain size. ECS isn't the solution to everything, but it is the kind of paradigm shift you have to have implemented in a sizable project to see the benefits.
This might just be the best video covering ECS I've seen. Just this week I began attempting to implement this on the GBA as the data driven nature of it really aligns well with that sort of limited environment
So, are all archetypes then essentially a form of entity? Like, all archetypes represent an entity, but not all entities belong to one archetype? I'm still struggling a bit with the concept, and I'm having trouble pinning down why. So, setting aside archetypes for a moment: my implementation was as first described in the video, where all entities have memory allocated for every possible field, but in addition to an id my Entity struct has a component bitmask, with a single bit corresponding to a single potential component. This enables very fast iteration through only the specific components tied to an entity. I could even define these bitmasks in a similar manner to archetypes, though I'm still using empty space, essentially.
Is the purpose of archetypes more for saving the usage of redundant space? I'm starting with the bitmask approach as I won't need all the additional memory just yet, but the memory overhead once the number of components and entities increase seems it could become untenable rather quickly.
Hey! Thanks, glad you enjoyed it! Sorry, this response got so long!
1. To answer your first question: I'd say that in an archetype-based ECS, all entities fall into exactly one archetype. You could think of an archetype as just the "pattern" that an entity matches, where the pattern describes all the components that an entity possesses.
2. The bitmask approach is definitely viable. I guess how viable it is depends on what type of entities your game will have. The only real downsides I can think of are:
2.a) "wastes" a bit of memory, which may also impact cache performance. This isn't really a problem until it is a problem
2.b) In the same vein of cache performance, it's *usually* better to have a "struct of arrays" (SoA) rather than an "array of structs" (AoS). The reason shows up when you need to loop over just one specific component. Let's say you want to increment the number of ticks every entity has been alive. Your cache utilization will be much more efficient if you just loop through an array of ints ("[]int"), and less efficient if you have to loop through your array of structs ("[]MyMegaStruct"). This is because when data gets pulled into cache, an entire cache line's worth of bytes gets pulled from memory, not just the value you asked for. So when you read your first int, the next 8 or so ints will be prefetched for you and will be in L1.
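A quick sketch of that contrast (struct and function names are mine, invented for illustration):

```go
package main

import "fmt"

// Hypothetical "mega struct" (AoS): ticking AliveTicks drags every
// entity's entire struct through cache, mostly unused bytes.
type MegaEntity struct {
	X, Y, VX, VY float64
	Health       int
	AliveTicks   int
}

func tickAoS(entities []MegaEntity) {
	for i := range entities {
		entities[i].AliveTicks++ // strided access: ~48 bytes apart
	}
}

// SoA: the same update walks one densely packed int slice,
// so each cache line holds nothing but the data we need.
func tickSoA(aliveTicks []int) {
	for i := range aliveTicks {
		aliveTicks[i]++ // contiguous access: 8 bytes apart
	}
}

func main() {
	aos := make([]MegaEntity, 3)
	soa := make([]int, 3)
	tickAoS(aos)
	tickSoA(soa)
	fmt.Println(aos[0].AliveTicks, soa[0]) // prints: 1 1
}
```

Both loops do the same logical work; the difference is purely in how much of each fetched cache line is useful data.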
2.c) The biggest improvement I see from archetypes is one of "pre-culling" (I'm not sure what to call it, so I'll call it "pre-culling" here). In the video I link below, he calls this general concept pulling data "out of band". But basically, because the archetype system automatically pre-filters entities by what components they possess, you don't have to do any internal checks to see if the entity you are working on is one that has the data you need. As an example, let's say you have 1M entities, but only 10 of them have a "Health" component, and you want to check if anything has died this frame. With the MegaStruct approach, you have to loop over all 1M entities and check if each one has a Health component. Whereas in the archetype approach, you just have to loop through the archetypes which have a Health component, which lets you easily discard the vast majority of your entities.
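Here's a minimal sketch of that pre-culling idea (all names are hypothetical, not from any real library): the whole archetype is skipped with one map lookup instead of one check per entity.

```go
package main

import "fmt"

// Each archetype stores packed component columns for the entities
// that share the same component set.
type Archetype struct {
	components map[string][]int // component name -> packed values
}

type World struct {
	archetypes []*Archetype
}

// DeadEntities visits only archetypes that actually contain "Health",
// never touching entities in other archetypes.
func (w *World) DeadEntities() int {
	dead := 0
	for _, a := range w.archetypes {
		healths, ok := a.components["Health"]
		if !ok {
			continue // whole archetype pre-culled in one check
		}
		for _, h := range healths {
			if h <= 0 {
				dead++
			}
		}
	}
	return dead
}

func main() {
	rocks := &Archetype{components: map[string][]int{"Position": make([]int, 1000)}}
	mobs := &Archetype{components: map[string][]int{"Health": {10, 0, 3}}}
	w := &World{archetypes: []*Archetype{rocks, mobs}}
	fmt.Println(w.DeadEntities()) // prints: 1
}
```

The thousand "rocks" cost exactly one `continue`, no matter how many there are.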
All of these downsides I would say are more "scalability" problems. If you don't have a lot of things to do work on, you'll probably see very minimal benefit from improving things like cache access rates. On the flip side, if you have a ton of entities with a lot of different archetype patterns and your access of these components isn't efficient, it's kind of a "death by 1000 cuts".
I watched this good video yesterday you actually could check out if you want more intuition about data oriented design: th-cam.com/video/WwkuAqObplU/w-d-xo.html
Hope that helps!
Great video, I was reading several articles and could NOT wrap my head around what the archetype is supposed to be. Your video was very easy to understand and much appreciated!
Glad it was helpful!
So what is it? Is it about avoiding downcasts and thread locks per data by putting them into one tuple?
@@techpriest4787 It's for cache optimization, basically you only request the data that you will be operating on, this makes sure only that what goes into CPU cache is things you need, minimizing the need to access RAM (which is time consuming comparative to cache)
@aghayejalebian7364 Ah. So at this point you would also use pointers instead of an index, to avoid accessing the entire vector of entities. It's not final storage but a per-system filter/isolation/window/scope type. I guess now I understand why the word "type" is in archetype. The new "type" tells the caching to keep things apart.
Seems this only works well with system schedulers, because their parameter signatures determine such archetypes and produce the types via metaprogramming. At least it seems that this is how Bevy does it.
Also seems that my Prefabs Nodes Systems kernel would do something similar if I grouped the prefabs into their own vector inside the main vector. Such a jagged array (I'm using Rust, which is why I say vector instead of array) should also isolate by type, though it would only do that on a per-prefab basis and not per system execution. I wonder how much difference it would make. Ditching the rest of the vector seems to be a massive gain already.
Nice. I tried writing an ECS in TypeScript a year ago but gave up after being stuck in the weeds debugging type issues. Now I'm redoing the project in Python, and this video has been very helpful, as you're extremely concise compared to other resources.
Thanks! Glad that you found it helpful!
I'd been saying for years that Go wouldn't be viable for a working ECS until it had generics. And here we are.
Yeah I originally tried to build one by generating code but it was incredibly painful to work with. Glad to have generics around now.
Very nice explanation of what goes into an ECS. Thanks for the Flecs shoutout!
Thanks! Your medium articles are incredibly good and gave me a lot of solutions to many of my ECS problems. Glad to have had you watch and like my video!
Great video as always. Curious to see the next video. My current ECS implementation uses a similar approach.
:) Yeah benchmarks are always fun to see!
Great content! Thank you for your time assembling such a great summary in such a consumable way! :D
Thanks! Glad you enjoyed it! :)
This is really cool, I don't know anything about Go, but the background knowledge of ECSs is really nice!
Thanks glad you liked it!
Underrated channel; also a nice listen. thank you
Thanks! Appreciate it!
great video and explanation Unit! 🙂
Thank you!
My custom ECS in Rust uses sparse arrays instead, and I keep them sorted (by entity index). This means running the ECS system is as fast as it can be (you're literally going over a flat array that stores all data densely). The trade off is that getting a component by entity index needs to use binary search to get to the location.
Archetypes incur the cost at adding/deleting components instead of fetching them. It really depends on the use case whether one or the other is better. I really like to just add and delete components a bunch for a dynamic gameplay loop where you can alter state by adding or removing components. For the archetype pattern that's horrible. Yet if you like to fetch components by entity index a lot in your code the binary search my implementation has to do can become a bottleneck.
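A rough sketch of the sorted-dense-array approach described above, translated into Go since that's this video's language (type and method names are mine): iteration is a flat scan over packed data, while random access pays for a binary search.

```go
package main

import (
	"fmt"
	"sort"
)

// SparseColumn stores one component type densely, sorted by entity index.
type SparseColumn struct {
	entities []int     // sorted entity indices
	values   []float64 // values[i] belongs to entities[i]
}

// Insert binary-searches for the slot, then shifts to keep order.
func (c *SparseColumn) Insert(entity int, v float64) {
	i := sort.SearchInts(c.entities, entity)
	c.entities = append(c.entities, 0)
	c.values = append(c.values, 0)
	copy(c.entities[i+1:], c.entities[i:])
	copy(c.values[i+1:], c.values[i:])
	c.entities[i] = entity
	c.values[i] = v
}

// Get pays O(log n) per random lookup: the trade-off mentioned above.
func (c *SparseColumn) Get(entity int) (float64, bool) {
	i := sort.SearchInts(c.entities, entity)
	if i < len(c.entities) && c.entities[i] == entity {
		return c.values[i], true
	}
	return 0, false
}

func main() {
	var health SparseColumn
	health.Insert(42, 10)
	health.Insert(7, 3)
	v, ok := health.Get(7)
	fmt.Println(v, ok) // prints: 3 true
}
```

A system that only iterates can just range over `values` directly, which is the "as fast as it can be" case.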
It's really fun to create an ECS but it's also full of very benign seeming nuances that can have a huge impact for hot loops.
Oh this comment helped make Archetypes click better for me, by posing the question of where exactly the cost lies in the approach. I'm working on a c library for gameboy advance (it's not great but I'm learning lol), and it's a delicate balance between memory and processing power, so I'm having the feeling Archetypes might be more trouble than worth here.
@@graydhd8688 That sounds like a really fun project! The problem is how big and fast CPU cachelines are, though, and how many the CPU has. If the CPU is great at prefetching but you are fetching more components than you have cachelines in an ECS system query, my approach will have to invalidate the cache multiple times for a single entity lookup, because you'd be pulling from multiple memory locations (component storage arrays). In that case archetypes are way better, since they can prefetch exactly as efficiently as the hardware allows.
The only way to find out is benchmarking both, but that's way too much work for a single dev wanting to finish a product in a decent time, so maybe not worth it for you. I'd just try the approach I outlined above, and I'll add that I was an idiot doing binary searches for fetching. You need the binary search to keep the arrays sorted when you insert components, but for fetching you should just have a hashmap with the entity index as the key and the actual location in the component array as the value. No need to binary search that way, though maybe on your hardware that's too much memory.
Good luck with your project!
@@stysner4580 The GBA doesn't have anything so luxurious as a cache, unfortunately, lol. My options are 32 KB of internal work RAM for data needing the quickest access, and 256 KB of external RAM for the rest (not including the ROM/cartridge). I plan on doing more than one implementation so I can stress test and compare approaches. I'll have to add up the size of the data used across all components for a single entity and figure out a reasonable range of entities and total number of components I could manage in both faster and slower memory.
@@graydhd8688 Ah, I have no idea about the GBA architecture, as I couldn't really place what hardware it uses, but if it's anything like an N64 or similar, you are probably limited by memory speed 99% of the time. That's true for modern desktops as well, but you don't have the luxury of caching everything for perf because you simply don't have enough RAM, meaning you're likely to end up reading from a disk.
It again depends on the game/application, but I'd say that given this limitation you'd be better off being conservative about changing the actual component data, and leave the cost at instantiation rather than at runtime. You probably can't afford to trade some performance to be actively adding and removing components like what I set out to do. I'm targeting modern PCs only.
honestly that chart at 1:15 instantly made it make sense to me holy crap
Very cool! The Archetype pattern loosely reminds me of how the v8 Javascript engine stores Prototypes, the internal object schema for all objects in a Javascript runtime.
Oh cool. I had no idea! Thanks for sharing!
the old name for archetypes was prototype!
I tried a lot of different ecs systems. What worked for me and is good enough for
Yeah. That's definitely a viable approach. The only downside (besides cache coherency) I would see is that if you need to filter your 20k entities for the 5 that contain a certain set of components, you would still have to loop over all 20k entities to see what components they have. But that might not be a huge issue or it might be mitigatable depending on the app. Thanks for watching!
@@UnitOfTimeYT It indeed really depends on the application.
4:25 Why can't we just loop over the health array directly and add a variable inside the Health component to check if it is used by the entity?
In that scenario, the 'for' loop you execute must check every single entity in the game, rather than only the entities that contain the health component. If you're building a simple game without lots of entities, then this probably won't impact performance very much. But laying out the memory in the ECS way that I describe gives you pre-filtered arrays that you can loop over, rather than manually filtering at loop execution time. Hope that helps!
Great lecture, thank you.
dude that thumbnail had me puzzled for a minute
Great video
In your first little example, it would make more sense to have position but not velocity. I don't see why something would have velocity and no position.
Instead of searching for entities in an archetype, I just update each archetype's list of entities every time a change in components happens, on an event model, which is much faster.
So each archetype knows about all the eligible entities at any given moment.
Yeah, sounds smart. I do the same thing for individual entity lookups. I have one table for id -> archetype, then inside of an archetype I have another table to look up the index.
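A small sketch of that two-table lookup (names are mine, not from the actual library): one map resolves the entity's archetype, and a second map inside the archetype resolves its row in the packed arrays.

```go
package main

import "fmt"

// Archetype holds packed component columns plus an id -> row table.
type Archetype struct {
	rowOf    map[int]int // entity id -> row in the packed arrays
	position []float64   // one packed component column
}

// World holds the first table: entity id -> archetype.
type World struct {
	archetypeOf map[int]*Archetype
}

// Position does the two lookups described above.
func (w *World) Position(entity int) (float64, bool) {
	a, ok := w.archetypeOf[entity]
	if !ok {
		return 0, false
	}
	row, ok := a.rowOf[entity]
	if !ok {
		return 0, false
	}
	return a.position[row], true
}

func main() {
	a := &Archetype{rowOf: map[int]int{42: 0}, position: []float64{3.5}}
	w := &World{archetypeOf: map[int]*Archetype{42: a}}
	fmt.Println(w.Position(42)) // prints: 3.5 true
}
```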
Like reinventing the wheel?
What does the := operator at 1:40 mean?
The := operator in Go is just used to create a variable and assign the values to it. In this case I'm just creating a pos and vel variable in a kind of "for each position and velocity in the world" sort of way
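A tiny illustration (the function is mine, just for demonstration):

```go
package main

import "fmt"

// integrate shows := in both common forms: a plain short variable
// declaration and the loop-variable form.
func integrate(pos, vel float64, steps int) float64 {
	p := pos // := declares p and infers float64 from the right-hand side
	for i := 0; i < steps; i++ { // i is also declared with :=
		p += vel
	}
	return p
}

func main() {
	fmt.Println(integrate(10, 2.5, 3)) // prints: 17.5
}
```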
Since you are working on Golang, I wonder if channels help in this case for concurrency/parallelism?
Yeah. I think Go has great concurrency. Because the data is laid out really nicely, all I have to do is section off blocks of data to be processed concurrently, i.e. [0, 1023], [1024, 2047], etc. I probably wouldn't pass the underlying data around via channels, but I might pass things like slice headers that tell worker threads which section they are responsible for processing.
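A sketch of that chunking idea (function name is mine): workers receive only slice bounds, not the component data itself, and disjoint ranges mean no locks are needed.

```go
package main

import (
	"fmt"
	"sync"
)

// integrateParallel splits the component arrays into fixed-size blocks
// and integrates each block in its own goroutine.
func integrateParallel(pos, vel []float64, chunk int) {
	var wg sync.WaitGroup
	for start := 0; start < len(pos); start += chunk {
		end := start + chunk
		if end > len(pos) {
			end = len(pos)
		}
		wg.Add(1)
		go func(lo, hi int) {
			defer wg.Done()
			for i := lo; i < hi; i++ {
				pos[i] += vel[i] // disjoint ranges: no locking needed
			}
		}(start, end)
	}
	wg.Wait()
}

func main() {
	pos := []float64{0, 0, 0, 0}
	vel := []float64{1, 2, 3, 4}
	integrateParallel(pos, vel, 2)
	fmt.Println(pos) // prints: [1 2 3 4]
}
```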
But wait ... why? The key point of ECS is memory locality isn't it? When I have used C++ ECS architectures entities will tend to be structs which are all the same shape and can be put into a single vector and processed sequentially by a system. This allows the full benefits of CPU caching making the game faster with less misdirection when looking up data. However the approach here seems to have entities which are irregularly shaped meaning that you don't get those benefits. You do it just for code organisation? Or am I missing something?
I think there is some confusion. An entity is a collection of components (or an ID that points to a collection of components). So ECS libraries need some way to store that collection of components for the entity ID. A cache-friendly way to store components is to put everything of the same "shape" in one array and loop over that. The problem arises when you have entities with *different* sets of components, so that not all entities *have* all components. Then you either bite the bullet and have "holes" in your arrays, or you do some sparse set magic (like what EnTT does), or you use archetypes (like Flecs does, and what I decided to do). IIRC, Bevy lets you pick at runtime what sort of component storage you use (sparse sets or archetypes). Depending on your data access patterns, different component storages will give you better performance.
So I think the "irregular shape" you're mentioning is that of the entity, not the component. The components are still packed tightly, just like the cache prefers.
Hope that helps!
@ yes that does make sense. Thanks for a pretty complete answer 😀
Interesting idea. But man, that's a lot of overhead just for a few doubles, and so many iterations to get to the actual data.
I'd just make an array of doubles, keep track of each entity/component start index and length in the array and then for integration send that entire array to the integrator (s += dt*ds). And if there are colliding objects I'd not give each object access to each other object in the world, I'd pre-cluster them so that each one checks a handful but not millions of entities.
Appreciate the comment. I think it feels like a lot of overhead when laid out in the video, but the core of the loop is still what you described (ie pass an array to an integrator or system function). The overhead is only needed when fetching that array (which would be once per tick). Incidentally, some of that overhead is still present when looking up one individual entity, but I'm looking into ways to reduce that if I can. For my cases, looking up individual entities is somewhat uncommon, but still - always a way to improve!
The challenge with just making a single array of doubles and tracking a section of that for each entity is that it doesn't scale well (cache-wise and memory-wise) when the entity has a lot of components. It also causes a decoding problem of "How do I decode this chunk of floats/bytes into my entity struct" - or if you pack a bunch of structs in a big array you'll just waste space when entities have blank data. Obviously, these could both be reasonable approaches depending on your game/data layout.
I don't think I had collision code in this particular video, but yeah, It's more optimal to pre-cluster your colliders either via spatial hash or KD trees. Hoping to look into that for the future!
Thanks for watching!
@@UnitOfTimeYT Thanks for the reply. Well I guess it depends on how much data you actually have and what kind of different entities are needed, as you said.
Just wondering, why can't you just put all like components in their own arrays? Each component would just need a reference to affect its entity, and this would work fine right? This sounds simpler than making archetypes, but I'm curious why I'm probably wrong.
I might not understand your exact question, so let me know if my response doesn't answer it.
You can think of one Entity as being a column created by pulling a component out of every array at the same index. So if we pull index 5 out of every component array, that gives us the full entity at index 5.
The main problem that led me to archetypes is: "What do you do when you have entities that don't have a specific component?". If I have one big array for all the health components, but then index 5 is a rock with no health, I still have to put something in the index 5 health box. There's a couple of ideas:
1. Make it a health pointer, then just null it when it's not there: this means all healths need to be pointers, which means a lot of extra dereferencing that could make us slower.
2. Add a bool to indicate that the entity doesn't have health: this is similar to the health pointer, but wastes some space every time an entity doesn't have a health component.
The biggest problem with ideas 1 and 2: let's say we have a system that draws healthbars, we have 1M entities, and 1k of them actually have the health component. If we have one big array, we have to loop over all 1M entities, first checking if each has the health component, and second drawing the healthbar.
There's two solutions I know of to the "one big array" problem:
3. Archetypes like I discussed
4. Sparse sets - where you still have one big array per component, but you do some mapping magic to compress a sparse array into a dense array. This lets you just delete all of the holes.
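Here's a minimal sketch of that sparse-set "mapping magic" (all names are mine, not from any real library): the sparse map points each entity id at its slot in the packed arrays, so iteration never sees a hole.

```go
package main

import "fmt"

// SparseSet keeps only entities that actually have the component.
type SparseSet struct {
	sparse map[int]int // entity id -> index into the dense arrays
	dense  []int       // entity ids, packed
	health []int       // values, packed in the same order
}

func NewSparseSet() *SparseSet {
	return &SparseSet{sparse: map[int]int{}}
}

func (s *SparseSet) Add(entity, hp int) {
	s.sparse[entity] = len(s.dense)
	s.dense = append(s.dense, entity)
	s.health = append(s.health, hp)
}

// Remove swaps the last element into the hole, keeping dense packed.
func (s *SparseSet) Remove(entity int) {
	i, ok := s.sparse[entity]
	if !ok {
		return
	}
	last := len(s.dense) - 1
	s.dense[i] = s.dense[last]
	s.health[i] = s.health[last]
	s.sparse[s.dense[i]] = i
	s.dense = s.dense[:last]
	s.health = s.health[:last]
	delete(s.sparse, entity)
}

func main() {
	s := NewSparseSet()
	s.Add(100, 10) // only entities with Health ever enter the set
	s.Add(200, 20)
	s.Remove(100)
	fmt.Println(s.dense, s.health) // prints: [200] [20]
}
```

The healthbar system from the example above would then loop over `s.health` directly: 1k iterations, not 1M.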
Both archetypes and sparse sets come with their own tradeoffs. I personally preferred archetypes because it made more sense to me.
Just as a little side-note, there's two reasons you might structure your game this way:
1. Organizationally, as a gamedev it's nice to dynamically add and remove components from your entities, and then based on those components run different systems on them.
2. You have so many entities that organizing the memory becomes important
Performance may not matter to you, and in that case you can definitely implement some simpler ECS patterns if you want. Also, organizationally you might not need that much configurability (in fact, you may not even want it). So there are definitely reasons for and against using an ECS in general.
Hope that helps!
@@UnitOfTimeYT Thanks for taking the time to explain, I appreciate it. I just realised that I was asking why we don't use the slow ECS, which you answered in the video already 0_0. I guess what didn't click at first is why the 'slow ECS' is slow, and I kinda understand now.
@@Shack263 All good! Happy to answer :) Yeah IMO at this level of performance optimization it becomes really situational on which memory organization will become faster, and it's not super obvious why until you really dig into it.
Hi, great video! What do you use to create those illustrations ?
Hey, thanks! For some of them (specifically the bubbles floating around) I pointed my OpenGL framebuffer at ffmpeg and encoded it to a video. For the graphs, I used the very popular and full-featured library manim. And for block diagrams I usually use draw.io. Then I render code blocks to images using a tool called marp. Now that I think of it, I'm using a lot of different things, lol.
Sorry, I thought this comment was on my Rust vs Go video. For this one it's mostly draw.io for diagrams and then marp for generating slides! Got the idea from NoBoilerplate, who does slide-based video presentations.
Great!
Very cool system
I was playing with it as I have recently just started using Go.
However, I have a question... If you were to create a system that, say, relies on more than 2 components, would you have to write a new implementation of query2 for each query you would need? Or would code generation come in handy here? Sorry if this is a dumb question; I'm not really an expert Go developer.
Thanks for the video by the way. So cool to watch.
Hey, thanks. Yeah, unfortunately Go doesn't support variadic generic arguments, and Go also doesn't support overloading function names. So if I want to iterate over 9 different components in my ECS, I have to write a query9 function (for example). This seems to be how the Go authors do it when I've seen them write functions like Map2(), Map3(), etc. Luckily, I haven't had to iterate over more than 3 so far. I also think that if your components are so sparse that you have to iterate over a lot of different arrays at the same time, then you might need to reorganize. But what you pointed out is definitely a very important callout and limitation of my ECS. Thanks for pointing that out!
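To make the limitation concrete, here's a toy two-component query (the function is mine, not the actual library's API); a three-component version would need a separate, nearly identical Query3, since Go has no variadic generics:

```go
package main

import "fmt"

// Query2 zips two component slices that share the same entity order
// and length, calling f once per entity.
func Query2[A, B any](as []A, bs []B, f func(a *A, b *B)) {
	for i := range as {
		f(&as[i], &bs[i])
	}
}

type Position struct{ X, Y float64 }
type Velocity struct{ X, Y float64 }

func main() {
	pos := []Position{{0, 0}, {1, 1}}
	vel := []Velocity{{2, 0}, {0, 3}}
	Query2(pos, vel, func(p *Position, v *Velocity) {
		p.X += v.X
		p.Y += v.Y
	})
	fmt.Println(pos) // prints: [{2 0} {1 4}]
}
```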
Cache coherence might not be the correct term to use here, as it refers to keeping multiple caches that reference the same memory space consistent. I think it would have been better to use cache locality or cache miss rate for this example. Either way, the intention is understood from the explanation and visuals.
Ah yeah you are definitely right. Thanks for calling it out, my mistake!
Archetypes should not be intersecting sets. The set "position, velocity, sprite" already contains the set "position, velocity". You can read the former without accessing the sprite as if it was the latter set. This prevents you from keeping track of and updating more sets than you actually need. The sets "position, velocity" and "position, sprite" are not intersecting and would be the kinds of sets you actually want to keep track of in this way. Other than that, I really liked the video.
Yeah that could certainly be a way to reduce the number of archetypes you have. Practically, I'm not sure how you'd heuristically decide in a simple way which archetypes fall into others without ending up with one mega archetype that contains every possible component. In practice though, I do typically have very similar archetypes that are like "all of these components, but also this one" so there is maybe some potential for optimization there.
Hey, I'm currently working on implementing an ECS in rust. I decided to use a sparse-set based system instead of archetypes, but my question is around how you managed to multithread your ECS? Are there any resources you can point me towards, as hours of googling have given me really unspecific results that haven't been any help.
Thanks.
Hey, Yeah I think there's two main ways you might multithread:
1. You could multithread different systems. You can think of your set of systems as a Directed Acyclic Graph (DAG), and then figure out the dependencies between each system. Then you can run two systems at the same time if they don't depend on each other and don't use the same underlying storage.
2. Inside of each system, if you have a large enough dataset, you can parallelize the execution of your function on the underlying dataset. I'd guess that this is much easier for archetype based ECS's. Mostly because in archetype ECS's you can just divide up indexes because each component array has the same order (with respect to entity). My understanding is that for sparse sets, different arrays can have different entity orders which makes it hard to divide up what work needs to be accomplished.
I think if you check the description of this video there's two links for two famous ECS authors, you might check out their articles. They probably have more information about this ( I don't have specific references tho, sorry)
@@UnitOfTimeYT Hey, thanks for the reply.
I did some research on implementing this in rust; it would certainly be difficult and would be macro heavy or have clunky syntax, beyond actually developing a concrete multi-threading implementation.
It leads me to think; is multithreading actually an upside? For reference, the game I developed this ECS for is a Rogue-lite. Adopting a multithreaded system will mean that very few things can be truly parallel, as there must be a sequential order of events in terms of systems (physics comes before rendering), and because of the nature of a rogue-like bullet hell, many systems can affect other parts of systems. This makes the overhead of having a multithreaded system questionable, since spawning threads is not cheap and figuring out the dependencies of systems isn't either.
In addition, for things that affect the world, you'd need to delay them to some sort of sync point. You can't have two systems concurrently spawn an entity with the position component unless everything is behind RW (Read-Write) Locks, which also makes the performance of such a system questionable.
What have you found in your game? With this, I developed an ECS for developer ergonomics with performance as a convenient upside. I'm not sure multithreading would be good for either.
Yeah I think you're rightfully calling out a lot of the difficult challenges with multithreading. You might take a look at how bevy does it as well. I believe they use both of the approaches I mention in my previous comment. For approach 1, you'll really only get gains if your game is such that you have separate systems that don't really interact, then for approach 2 I think you're more likely to get gains if you have large sets of data that don't interact.
In my own ECS library, I currently don't have much in the way of multithreading, though I've intentionally left room for myself, mostly via approach 2. Just from my game systems, I've observed that (other than the main draw calls, which aren't really multithreadable) there are a few systems that take up the majority of every frame's time: collision detection and batching dynamic geometry. I think optimizing the systems which take the most time is probably your best bet for making your game run faster.
Yeah, you'll definitely need sync points to add new entities back to your underlying storage. Bevy has "Commands" that they buffer inside a system and then execute at the conclusion of the system. I'm planning to eventually do something like that, but haven't really finalized it much past a basic prototype. I don't think the overhead of locking here will cause too much trouble, but I don't really have any data to confirm that, so I guess I'll find out as I do more profiling.
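A sketch of that deferred-command idea, modeled loosely on Bevy's Commands but with Go names I made up: systems queue world mutations, and the buffer is flushed at a sync point after the system finishes, so nothing mutates storage mid-iteration.

```go
package main

import "fmt"

type World struct {
	entities []string
}

// Commands buffers world mutations instead of applying them immediately.
type Commands struct {
	queued []func(*World)
}

func (c *Commands) Spawn(name string) {
	c.queued = append(c.queued, func(w *World) {
		w.entities = append(w.entities, name)
	})
}

// Flush applies all buffered mutations at the sync point.
func (c *Commands) Flush(w *World) {
	for _, cmd := range c.queued {
		cmd(w)
	}
	c.queued = nil
}

func main() {
	w := &World{}
	var cmd Commands
	for i := 0; i < 2; i++ {
		cmd.Spawn("bullet") // safe during iteration: world untouched
	}
	cmd.Flush(w) // mutations land here, between systems
	fmt.Println(len(w.entities)) // prints: 2
}
```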
Right now my game isn't really big enough to warrant much multithreading and I get most of my optimizations by just improving the rendering code. Most of my concurrency surrounds the IO part of my game: ie pulling in packets and decoding them is done concurrently. And then, most of my scalability problems (with things like rendering and physics) I've been able to solve by just using better algorithms.
Hope it helps!
@@UnitOfTimeYT Yes, this has been really productive.
I think the correct approach for me is to develop some sort of task/job multithreading model and have systems run sequentially with access to some sort of job resource, which enables me to multithread on a case-by-case basis. For example, if I implement a procedural generation algorithm which has potential for multithreading, I could do it specifically in that system without an overarching approach, and everything else would run sequentially.
I also agree with you in the sentiment that my game is not big enough to warrant multithreading; the game I'm setting out to create will be 2.5d and performance will almost certainly not be an issue.
I have also taken a gander at Bevy's, Specs', and Legion's source code to varying degrees and have read about their multithreading implementations. The conclusion I've come to is that it's way too complicated for just me to tackle.
Overall, after doing so much research, I'm just overall not sure multithreading for this specific project is a net gain. The benefits will be limited, and if needed, I could multithread on a case by case basis. Much of my development decisions have been driven by EnTT and the article you linked by its creator, and after looking at EnTT, it seems they adopt this sort of approach as well.
Noot Noot
Careful with the "add" and "remove".
In gamedev, there should be no adds or removes, if by that you imply memory reallocation.
Have fixed-size arrays; do not reallocate, ever.
Predetermine how big the arrays must be, limit object instantiations, and keep track of the array's actual size (an index pointing to the "end" within the big array).
And simply juggle data around within it.
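A minimal sketch of the fixed-capacity idea above (names are mine): the backing array is allocated once, "remove" swaps the last live element into the hole, and a live count marks the "end", so nothing ever reallocates.

```go
package main

import "fmt"

// Pool is a fixed-capacity component store: allocated once, never grows.
type Pool struct {
	data [1024]float64 // the whole backing array, allocated up front
	live int           // index one past the "end" of live items
}

// Add returns false at capacity; the caller must handle it - no realloc.
func (p *Pool) Add(v float64) bool {
	if p.live == len(p.data) {
		return false
	}
	p.data[p.live] = v
	p.live++
	return true
}

// Remove swap-removes: juggle data in place, shrink the live count.
func (p *Pool) Remove(i int) {
	p.live--
	p.data[i] = p.data[p.live]
}

func main() {
	var p Pool
	p.Add(1)
	p.Add(2)
	p.Add(3)
	p.Remove(0) // last element (3) fills the hole
	fmt.Println(p.data[:p.live]) // prints: [3 2]
}
```

Note that swap-remove gives up stable ordering, which matters if other tables index into this array.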
Yeah I'd generally agree that it's good to reduce and remove memory allocations. In my ECS there's an initial growth period as arrays get sized up to hold more entities, but at some point the array gets large enough for the application and entities are essentially "pooled" in terms of their memory.
Eh I disagree. For my framework, both for CPU/RAM and GPU/VRAM allocation I allocate in chunks. That means you don't have to pre-allocate all the data you might ever need, and if you only re-allocate when you go two chunk sizes above or below the current size, you also don't get the problem where removing a little bit and adding a little bit each frame re-allocates every time. Using "never re-allocate, ever" as a mantra just makes your life unnecessarily hard. If it turns out to be a problem later, you can always optimize it then. It seems to me "never re-allocate" is more a dangling pointer/reference hell concern than it is a performance one.
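The grow/shrink hysteresis this comment describes can be expressed as a pure capacity policy. A sketch under stated assumptions: the chunk size of 256 is arbitrary, and `target_capacity` is a made-up name, not any framework's API:

```rust
const CHUNK: usize = 256; // arbitrary chunk size for illustration

// Decide the new capacity for a buffer holding `len` elements in `cap`
// capacity. Growing snaps to the next chunk boundary; shrinking only
// happens once we are more than two spare chunks below capacity, and even
// then one spare chunk is kept, so adding and removing a little each
// frame near a boundary never triggers a reallocation.
fn target_capacity(len: usize, cap: usize) -> usize {
    let chunks_needed = (len + CHUNK - 1) / CHUNK;
    if len > cap {
        chunks_needed * CHUNK // grow to the next chunk boundary
    } else if cap > (chunks_needed + 2) * CHUNK {
        (chunks_needed + 1) * CHUNK // shrink, keeping one spare chunk
    } else {
        cap // inside the dead band: leave the allocation alone
    }
}
```

Keeping the policy separate from the allocation itself also makes the dead-band behavior easy to unit test.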
@@stysner4580 your game's resource requirements are non-deterministic then.
it's better to allocate 8 GB or so once on startup or map loading and be done with it, instead of having hiccups every now and then just because you pass a threshold and need to allocate new chunks or free some chunks...
@@tanko.reactions176 On modern hardware, allocating another couple of megabytes isn't a huge issue at all (of course not every frame). If you just freewheel it and run into hiccups, you can always implement some very simple object pooling and do your allocations there. Most games allocate a bunch of stuff at startup and stream the rest of their assets, but in that case you should have object pools or similar; allocating all the memory you might ever need at application startup is a bit too oldschool for my liking. Memory arenas with chunks give you flexibility and performance. If you have different settings that might impact how much memory the app needs, you'd have to have some way of calculating the required capacity for every combination of settings. It just seems like a hassle for no good reason.
How to beat the slow allegations 😅
Hi, just a hint from a non-native English speaker: you speak way too fast and you seem to have removed the pauses between sentences. It's therefore hard to follow your explanations. I need to rewind every now and then, which doesn't always lead to better understanding. That's a pity, since you seem to know what you're talking about.
Thanks for the feedback! Yeah I agree, especially in my earlier videos I talk way too fast. Will try to continue to improve this for future videos!
I am not sure I get the point. It sounds like this is just trying to fix problems that are otherwise fixed by just writing code properly.
Can you outline exactly why you would want to structure your code like this please?
Sure, this sort of technique is mostly used in game development, where you have relatively tight performance requirements and oftentimes need very flexible object creation (where you can dynamically add and remove components at will). This way of organizing your data, as a struct of arrays of components, gets you better cache performance than organizing it in the opposite direction (i.e. as an array of structs of your components). Definitely not a fit for every software project. Hope that helps.
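The struct-of-arrays vs array-of-structs distinction can be made concrete with a small sketch; the component names (`pos`, `vel`, `ticks`) are made up for illustration:

```rust
// AoS: each entity is one struct. A loop that only needs `ticks` still
// drags the `pos` and `vel` bytes of every entity through the cache.
#[allow(dead_code)]
struct EntityAoS {
    pos: [f32; 3],
    vel: [f32; 3],
    ticks: u32,
}

// SoA: one contiguous array per component. A system that only needs
// `ticks` streams through tightly packed u32s, so each cache line
// fetched is full of useful data.
struct WorldSoA {
    pos: Vec<[f32; 3]>,
    vel: Vec<[f32; 3]>,
    ticks: Vec<u32>,
}

fn advance_ticks(world: &mut WorldSoA) {
    for t in world.ticks.iter_mut() {
        *t += 1;
    }
}
```

The two layouts hold identical data; only the memory order differs, which is exactly what the cache-utilization argument above is about.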
@@UnitOfTimeYT It does, thanks, but aside from the performance benefits, which I think you could mostly get by not mangling your domain objects with a ton of needless inheritance and then using the appropriate data structures ... why is it worth essentially denaturing your codebase?
It feels like a really horrible way to organise your code from a DX point of view.
What's it like to work with?
@@sacredgeometry You're stuck in OOP land. You never get the diamond pattern issue with ECS. You need to really shift your reasoning about data structures and architecture to understand why ECS has benefits. Refactoring in ECS is incredibly convenient, you can just define a new component and write a system that takes a certain set of components, it's completely separate and can be deleted to get back to where you were without any refactoring of existing classes or reimplementing certain templates/interfaces.
Performance is always the first thing mentioned with ECS because it's a DOD pattern very popular in game development, but even for game development the greatest benefit in my eyes is the architecture of the codebase itself. You can mix and match to your heart's content; you can create any entity type you want, and all your data and code stays separated and modular.
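The additive refactoring workflow described above ("define a new component and write a system, delete both to get back to where you were") can be sketched with Option columns standing in for real component storage; the `poison` feature is hypothetical:

```rust
// Each component is a column; an entity "has" a component when its slot
// is Some. Adding a feature means adding a column plus a system; nothing
// that already exists needs to change.
struct World {
    health: Vec<Option<i32>>,
    poison: Vec<Option<i32>>, // the new, bolted-on component
}

// The new system only touches entities that have both components.
// Deleting this function and the `poison` column undoes the feature
// without refactoring any existing code.
fn poison_system(world: &mut World) {
    for (h, p) in world.health.iter_mut().zip(world.poison.iter()) {
        if let (Some(h), Some(p)) = (h.as_mut(), p) {
            *h -= *p;
        }
    }
}
```

A real ECS replaces the Option columns with denser storage (sparse sets, archetypes), but the compositional shape of the code is the same.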
@@stysner4580 Performance isn't a consideration until it's a problem. Being able to rationalise your codebase is, in most applications, far more important. And I'm not stuck in OOP land; I'm stuck in "your problem is one of your own making" land, as all the problems they're talking about have nothing to do with OOP and everything to do with how they have structured their code.
Not only that, but those problems can for the most part still exist in ECS, as in literally any paradigm. It's really just a problem of inheritance vs composition, and plenty of functional approaches have extensive and inefficient type hierarchies that would not only suffer this problem but make it worse because of their preference for immutability.
@@sacredgeometry I can tell you've never made a game by your performance statement. "Premature optimization is the root of all evil" is such a stupid quote; if it takes 25% more time to implement something you know will have to be optimized later anyway, you should just do it right away. Also, saying "OOP is not a problem if you structure your code right" is really, really weird when ECS is literally another way of structuring your code.
Your second point is also pretty moot, since I created a custom ECS in Rust that works just fine, even though Rust would be the worst language to use if your statement were correct. It really doesn't matter what language you implement an ECS in; as long as the language has some kind of type erasure and an interior mutability pattern, you can implement an ECS just fine.
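The "type erasure plus interior mutability" combination mentioned here could look roughly like this in Rust. This is a toy sketch, not any particular crate's API: `RefCell` supplies the interior mutability and `dyn Any` the type erasure:

```rust
use std::any::{Any, TypeId};
use std::cell::RefCell;
use std::collections::HashMap;

// One type-erased storage per component type.
#[derive(Default)]
struct World {
    storages: HashMap<TypeId, RefCell<Box<dyn Any>>>,
}

impl World {
    fn register<T: 'static>(&mut self) {
        self.storages
            .insert(TypeId::of::<T>(), RefCell::new(Box::new(Vec::<T>::new())));
    }

    // Push through a shared reference: RefCell makes this possible,
    // and downcast_mut recovers the concrete Vec<T> behind the erasure.
    fn insert<T: 'static>(&self, value: T) {
        let cell = &self.storages[&TypeId::of::<T>()];
        cell.borrow_mut()
            .downcast_mut::<Vec<T>>()
            .expect("storage registered with a different type")
            .push(value);
    }

    // Run a closure over the concrete storage for T.
    fn with<T: 'static, R>(&self, f: impl FnOnce(&Vec<T>) -> R) -> R {
        let cell = &self.storages[&TypeId::of::<T>()];
        f(cell.borrow().downcast_ref::<Vec<T>>().unwrap())
    }
}
```

Production ECS crates add generational entity IDs, borrow checking across systems, and denser storage, but this is the core trick that lets heterogeneous component types live in one container.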
I think you've never written a sizable project in a non-OOP language, correct? Because this is what you see with any functional or otherwise non-OOP language: people say there are so many problems with them because they're more complicated at the start of a project. But the real scalability power of composition over inheritance and very strict type systems only shows once your project reaches a certain size.
ECS isn't the solution to everything, but it is the kind of paradigm shift you have to have implemented in a sizable project to see the benefits.
just use c++.