One thing I love about circular particle simulations like these is the topological defects that arise when the particles all settle down. You can clearly see them around 24:17, just before the transition: just like microcrystalline grains, you can see the domain walls separating the big cells where the spheres are tightly packed. Very cool.
Nice observation, but I think it might just be the video compression. A defect I noticed is how the very bottom particles, under a lot of pressure, are inexplicably shifting to the right slowly. That doesn't happen with fewer particles or less pressure, so it's probably due to the collision detection.
also if you vibrate them the chunks become bigger, like heat treatment of metals
@@keyframe41 defects are supposed to form, that's just how circles pack together. It's very hard for them to pack perfectly after being continually launched like water lol
The algorithm has delivered yet again
your ability to get so far into a project without quitting is insane
I always do the first 80% so fast and then.. yea. Truly inspiring stuff
Instead of storing a list of particle indices for each grid cell, you can sort the particles by grid index with counting sort. That way all particles in each grid cell are contiguous and you are not thrashing the cache looping over all particles in each cell.
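A minimal sketch of the counting-sort idea from the comment above, not the code from the video. The `Particle` layout, the `SortedGrid` type, and the `sortByCell` name are all assumptions made for illustration; it also assumes each particle's flattened cell index has already been computed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical particle type, for illustration only.
struct Particle {
    float x, y;
    std::uint32_t cell;  // flattened grid-cell index, assumed to be computed elsewhere
};

// Particles grouped by cell, plus the start offset of every cell.
struct SortedGrid {
    std::vector<Particle> particles;      // particles of one cell are contiguous
    std::vector<std::size_t> cell_start;  // cell c spans [cell_start[c], cell_start[c + 1])
};

SortedGrid sortByCell(const std::vector<Particle>& in, std::size_t cell_count) {
    SortedGrid out;
    out.cell_start.assign(cell_count + 1, 0);

    // 1. Histogram: count the particles of cell c in slot c + 1.
    for (const Particle& p : in) {
        assert(p.cell < cell_count);
        ++out.cell_start[p.cell + 1];
    }

    // 2. Prefix sum: counts become start offsets.
    for (std::size_t c = 0; c < cell_count; ++c) {
        out.cell_start[c + 1] += out.cell_start[c];
    }

    // 3. Scatter: copy each particle into the next free slot of its cell.
    std::vector<std::size_t> cursor(out.cell_start.begin(), out.cell_start.end() - 1);
    out.particles.resize(in.size());
    for (const Particle& p : in) {
        out.particles[cursor[p.cell]++] = p;
    }
    return out;
}
```

To visit one cell you then loop over `particles[cell_start[c]]` up to (not including) `particles[cell_start[c + 1]]`, which is a single contiguous read. The sort is linear in particles plus cells, so redoing it every frame is usually affordable, though that depends on the grid size.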
Cool video!
One thing to clarify is that std::move just casts to an r-value, which allows for move semantics if they are implemented for a type. You mention perfect forwarding, but for that you would use std::forward() inside a templated function that takes a forwarding reference T&& (with a deduced T, this is not the same as an r-value reference). Perfect forwarding has more to do with how types are deduced when passed to a function (look up reference collapsing), and allows the value category of the argument that is passed in to be preserved and not changed.
Another thing to think about for speedups is making all structs/classes as small as possible; if you can compress the size of the particle, for instance, you will likely see even more speedups because more particles fit in each cache line. For me this had one of the largest effects on my particle simulator.
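A small sketch of the difference described above. The helper names `category`, `forwarded`, and `notForwarded` are made up for this example; only `std::move` and `std::forward` come from the standard library.

```cpp
#include <string>
#include <utility>

// Reports which overload the argument selected.
inline std::string category(const std::string&) { return "lvalue"; }
inline std::string category(std::string&&)      { return "rvalue"; }

// T&& with a deduced T is a forwarding reference. std::forward<T> passes the
// argument on with the value category the caller used.
template <typename T>
std::string forwarded(T&& arg) {
    return category(std::forward<T>(arg));
}

// Without std::forward, the named parameter `arg` is always an lvalue,
// even when the caller passed a temporary.
template <typename T>
std::string notForwarded(T&& arg) {
    return category(arg);
}
```

Calling `forwarded(s)` with a named string reports "lvalue", while `forwarded(std::move(s))` reports "rvalue". Since `category` never moves from its argument, `s` is unchanged afterwards, which shows that `std::move` on its own is only a cast.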
Added this to the video description, very informative
@@keyframe41 I made a Barnes-Hut zero gravity particle simulator using SFML like 2 years ago and always get excited about related videos haha. Nice work with this one!
@@thesquee1838 No way, that's exactly what I want to try make for part 4. Thanks for your advice!
@@keyframe41 Awesome! I'm curious to see how yours goes. For mine I was re-inserting particles into the quadtree every frame and it ultimately was less efficient than just a generic grid for spatial partitioning; but it looked way cooler as I used SFML to visualize the sections dividing into more subsections. If you figure out a good way to keep the structure of the quadtree and just "shake" off dead branches and reinsert particles that move into another cell you may get better performance. People also have algorithms for doing lots of the quadtree related calculations on the GPU, or using SIMD intrinsics for things like collisions which help performance.
Awesome video, the algorithm has blessed me with a W. As for the multithreading performance issue, I may have a solution you can try when you're feeling up to it: your problem right now is that you have a fixed thread assignment structure, which means that in geometrically uneven simulations (like the first simulation you showed), it's still effectively up to one thread to simulate MOST of the particles. Add to that the extra overhead required to assign, lock and synchronize the threads themselves, and worse performance is to be expected.
What I'd advise you to try (I'm an amateur myself, so this may not work, but it sounds right) is looking over the cells in the grid to determine how many particles are in each one. Then you can define a constant that determines how many particles each thread handles at a time (this constant will take some tweaking to figure out, dw if you don't get it right away), and assign as many cells as you can to each thread to fill up that count. This means that one thread will get a bunch of empty cells, which will take it about the same amount of time to compute as another thread that handles a single cell with TONS of particles. Essentially, you are making sure that some threads handle the dense clusters of particles, while the other threads handle the almost empty parts of the simulation. This means you are actually parallelizing things correctly, and the threads should take roughly the same amount of time to finish all the calculations. Hopefully, this should work!
I'm really impressed with the project so far, it's motivated me to code in C++ again. Keep it up man!
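One way the suggestion above could look in code, as a rough sketch rather than a tested scheduler. The function name `balanceCells` and the `budget` parameter are assumptions for illustration: cells are cut into contiguous ranges so that each range holds roughly `budget` particles, and each range would then be handed to a worker thread.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Splits cells [0, n) into contiguous half-open ranges [first, last) such that
// each range holds at least `budget` particles, except possibly the last one.
std::vector<std::pair<std::size_t, std::size_t>>
balanceCells(const std::vector<std::size_t>& particles_per_cell, std::size_t budget) {
    assert(budget > 0);
    std::vector<std::pair<std::size_t, std::size_t>> ranges;
    std::size_t first = 0;
    std::size_t load = 0;
    for (std::size_t c = 0; c < particles_per_cell.size(); ++c) {
        load += particles_per_cell[c];
        if (load >= budget) {  // range is full: close it after this cell
            ranges.emplace_back(first, c + 1);
            first = c + 1;
            load = 0;
        }
    }
    if (first < particles_per_cell.size()) {  // leftover, possibly under-filled range
        ranges.emplace_back(first, particles_per_cell.size());
    }
    return ranges;
}
```

With per-cell counts `{0, 0, 0, 100, 0, 0, 50, 50}` and a budget of 100, this yields two ranges, cells 0 to 3 and cells 4 to 7, each carrying 100 particles. A single very dense cell can still exceed the budget, so this evens out the load without guaranteeing it.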
Very cool stuff man!
7:42 *vector* OHHH YEAH
oh wow, wow this looks like great quality and very inspiring, im blessed to have found this!
Damm, man, your vid is fire
Love that, I need to subscribe
Will love to watch more parts of it, and wish you good at your exams
You deserve more views and subs
I was recommended your first video just now and I really enjoyed watching both the parts. I would love to finish my particle physics code some day but I always get too busy with my main projects. Looking forward to more such content.
PS : It's nice to find somebody who watches the exact same youtubers I do lol
Very impressive. I could take a guess at your final image location :-)
Excellent work! Keep them coming!
Danm man you're underrated af , I'd love to see more of your work ❤
Amazing work! I'm absolutely implementing all your optimization techniques!! You should look into SoA (structure of arrays) vs AoS (array of structures). It's a small optimization boost due to better cache performance, which I found quite useful when making my particle systems!
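For readers who have not seen the two layouts side by side, here is a minimal sketch. The field names and the `translate` helper are invented for this example and are not taken from the video's code.

```cpp
#include <cstddef>
#include <vector>

// AoS: one struct per particle. A loop that only needs x and y still pulls
// prev_x, prev_y, radius and colour through the cache.
struct ParticleAoS {
    float x, y, prev_x, prev_y, radius;
    unsigned colour;
};

// SoA: one array per field. A loop over positions reads only the x and y arrays.
struct ParticlesSoA {
    std::vector<float> x, y, prev_x, prev_y, radius;
    std::vector<unsigned> colour;

    std::size_t size() const { return x.size(); }

    // Appends a particle at rest (previous position equals current position).
    void add(float px, float py, float r, unsigned c) {
        x.push_back(px);
        y.push_back(py);
        prev_x.push_back(px);
        prev_y.push_back(py);
        radius.push_back(r);
        colour.push_back(c);
    }
};

// Example pass that touches only the position arrays.
void translate(ParticlesSoA& p, float dx, float dy) {
    for (std::size_t i = 0; i < p.size(); ++i) {
        p.x[i] += dx;
        p.y[i] += dy;
    }
}
```

Whether SoA wins depends on the access pattern: passes that touch every field of a particle may do just as well with AoS, so it is worth measuring both.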
Great video man! Please don't make the music too loud.
you could make a circle with just one triangle and drawing the circle inside the triangle using shaders
Cool video! Love to see more
Great vid! You should make some tutorials for people optimistic about coding 😁
you can probably save like a millisecond if you make the grid cells overlap by the diameter of the particles. That way, you only need to check one cell
Actual beast ❤❤
Wonderful! Amazing work!
Nice work. Would love Jason from C++ Weekly to have a look at your threading code. He could well make it go a lot faster. He's like a C++ whisperer. Also watch out for std::this_thread::yield. I've been stung by it in the past. Just sleeping the thread for 10ms can be a lot faster. 🤷‍♂️
I wonder if a hexagonal (or, really, offset square) grid might be more efficient, since then each cell would only have 6 neighboring cells instead of 8.
This is fire bro!
awesome video brody
12:07 “Calamitas even.” wait, say that again…
calamitous even
you should think about mixing your audio and compressing the music track so it doesn't distort as much and leaves more room for your voice
so i guess i'll share my 2 cents and basically say that an rvalue is something that can appear _only_ on the right-hand side of the assignment operator
e.g. you can do `foo = 3 + 5` but cannot do `3 + 5 = bar` or `&foo = bar`
iirc The Cherno made a video about these, great channel if you want to learn some C++ stuff like this
got it!
Next: move the simulation to the GPU. For 1 million particles.
I did that a few months ago :)
It is able to handle around 1.3 mil objects at 60 fps. Been using CUDA for that
Is that a picture of the hugs building by lumphini park in bangkok?
make the window like a container box for your particles, it will be fun when you resize and drag the window around 😂
i did not see your first video but particle simulations are pretty cool...
Timestamps plz .really helps your video and channel 😊
done, thanks for reminding!
Could you rearrange the particles to produce completely different images? Like by normalizing the colors between two images so they have the exact same pixels, just in a different order.
Algorithm gave me second video but not first
all particle sims are in slow mo
Well done, I've got a simple question tho, wouldn't it be easier to draw the image using a simple fragment shader?
Awesome work here!! There is just something I don't really understand: Verlet integration assumes a constant timestep (because it uses 3 different instants, so 2 timesteps), but when using it in a simulation like you did, I get results defying the laws of physics, and I think it might be related to the timesteps 🤷, so can you help me a bit with that please? (I am only starting to learn C++, which is the reason I don't just read the code.) Once again, love your work
I'm not really getting what you're trying to say, but the code for part 1 is considerably easier to read, you can give it a try (the files are main_original, renderer_original, and solver_original)
it could be because he checks for collisions several times per frame (substeps), which he mentioned in part 1, and your simulation might not be doing that. I'm no expert tho
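To make the timestep point in this thread concrete, here is a hedged one-dimensional sketch of position Verlet with fixed substeps. It is not the solver from the video; `Body`, `verletStep`, and `advanceFrame` are names invented for this example, and collisions are left out entirely.

```cpp
#include <cassert>

// Hypothetical 1D body, for illustration only.
struct Body {
    float pos;
    float prev_pos;
    float acc;
};

// One position-Verlet step: x_next = 2x - x_prev + a * dt^2.
// The velocity is implicit in (pos - prev_pos), so dt has to be the same on
// every call; changing it between steps silently rescales that velocity.
void verletStep(Body& b, float dt) {
    const float next = 2.0f * b.pos - b.prev_pos + b.acc * dt * dt;
    b.prev_pos = b.pos;
    b.pos = next;
}

// Advances one frame as `substeps` equal steps of frame_dt / substeps.
void advanceFrame(Body& b, float frame_dt, int substeps) {
    assert(substeps > 0);
    const float dt = frame_dt / static_cast<float>(substeps);
    for (int i = 0; i < substeps; ++i) {
        verletStep(b, dt);
    }
}
```

The usual failure mode is feeding the measured frame time into `frame_dt`: if it varies from frame to frame, the implicit velocity becomes inconsistent and bodies can appear to gain or lose energy. Passing a fixed `frame_dt` (for example 1/60 s) and a fixed substep count avoids that.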
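Instead of yielding you could do (needs C++20 for std::atomic wait/notify):
auto localRemainingTasks = remaining_tasks.load(std::memory_order_relaxed);
while (localRemainingTasks > 0) {
    // blocks until remaining_tasks no longer equals the value we last saw
    remaining_tasks.wait(localRemainingTasks);
    localRemainingTasks = remaining_tasks.load();
}
Then, in the task-completion code, after the atomic subtraction, if remaining_tasks is 0, call remaining_tasks.notify_one() (or notify_all() if more than one thread can be waiting).
Permanently yielding while polling an atomic is wasteful.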
holy shit I SUCK!!!
Hi, could you share what VSCode theme you use and which extension you use to get the folder icons? I loved these videos
I use vscode-icons and atom one dark theme (both extensions).
when do we get a kerbal video?
Hmmmmmm, try changing the particles into a fluid? (The movement of the particles is very much like water.)
Stick around for part 3 which will be fluid simulation!
fix the mic: take out everything below 144 Hz with an EQ. Audacity has one you can use for recorded voice
That image has gotta come from southeast asia lol
why does the wallpaper change every scene
WHY IS YOUR TASKBAR ON THE TOP
its the menu bar
It's macos my man
Only 93 views? What
Great!
was on my fy
Dam dis epic as f*ck
Bruh my laptop can barely run 1000 particles😭
Very nice, go 3D)
🔥
Inspiring ✔
Underrated ✔
Entertaining ✔
Subscribing? ✔✔✔✔✔✔