Don't forget you still have a depth buffer! You can do polygons that are obscured by depth if you need to.
*@MaxMakesGames* 9:00 Optimization suggestion, if you could somehow add a bit to each voxel that indicates if the entire voxel is only air, then you can cancel that search MUCH sooner & move onto looking into the next neighboring voxel.
In that way you might be able to skip through the first, e.g., 3 "super clusters", then only start searching inside the clusters when you reach an edge, and all it costs is like +1 byte (ideally 1 bit, but computers only work in bytes, maybe you can use the remaining 7 bits for other data?) of storage per octree "container".
I already kinda do that. I mean if a big node has nothing placed inside it, it won't have children so the ray will skip over the whole thing. When there's no voxel in an area, I don't place air voxels I just don't place anything. When a node doesn't have children I just move to the next :)
@@MaxMakesGames Aha, I see, that's obviously even smarter 👍
What you're doing on voxels is called ray marching, advancing the ray in steps until it hits something. Pretty cool way to render octrees or signed distance functions. Also reflection/refraction shows its glory if you add Fresnel and total internal reflection. Without it, it's going to look crappy no matter what you do with it.
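For anyone curious what "adding Fresnel and total internal reflection" involves, here is a minimal Python sketch. It assumes Schlick's approximation for the Fresnel term and Snell's law for refraction; the function names are made up for illustration and nothing here is taken from the video's renderer.

```python
import math

def schlick_reflectance(cos_theta, n1, n2):
    """Schlick's approximation of the Fresnel reflectance (0..1)."""
    r0 = ((n1 - n2) / (n1 + n2)) ** 2
    return r0 + (1.0 - r0) * (1.0 - cos_theta) ** 5

def refract(incident, normal, n1, n2):
    """Return the refracted direction, or None on total internal reflection.

    incident and normal are unit 3-tuples; normal points against incident.
    """
    eta = n1 / n2
    cos_i = -sum(i * n for i, n in zip(incident, normal))
    sin2_t = eta * eta * (1.0 - cos_i * cos_i)
    if sin2_t > 1.0:
        # No refracted ray exists: all light is reflected.
        return None
    cos_t = math.sqrt(1.0 - sin2_t)
    return tuple(eta * i + (eta * cos_i - cos_t) * n
                 for i, n in zip(incident, normal))
```

A renderer would typically mix the reflected and refracted colors using the reflectance value, and fall back to pure reflection when `refract` returns `None`.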
Nice job on the raytracing! It was especially cool to see the water reflections and such at the end. It was also cool to hear you got the FPS pretty stable, similar to the last video
This is really cool. I'm excited to see how it looks when you’re able to incorporate everything back in.
The glass refraction is sick.
If you take a look at how the "multiple frames averaging" works in big budget games with ray tracing, they use things like denoisers and temporal reprojection. This makes it possible to have complex ray tracing effects and even path tracing in real time, at decent quality. In essence, when the camera moves, you don't drop the result of the previous frame but try to transform it in a way that it fits the new picture. There will be some disocclusion and new things coming in on the edges of the screen, and you only start from scratch on those pixels. Also, instead of starting from scratch, you could render basic rasterized lighting on those pixels, and fill it in with ray traced effects over the next few frames. Denoisers also work in a temporal manner. This results in a "lazy" update but there won't be popping of ray tracing effects and such. It's more like a smooth progression.
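The "keep the previous frame and only restart on new pixels" part could be sketched like this in Python. This is a simplification under stated assumptions: the reprojection step itself is left out, `valid` is a hypothetical per-pixel flag saying whether the reprojected history can be trusted, and pixels are single brightness values in flat lists.

```python
def accumulate(history, current, valid, alpha=0.1):
    """Blend the new noisy frame into the history buffer, pixel by pixel.

    valid[i] is False where the history cannot be trusted (a disocclusion
    or a pixel that just entered the screen); those pixels start over.
    """
    out = []
    for h, c, ok in zip(history, current, valid):
        out.append(h + alpha * (c - h) if ok else c)
    return out
```

A small `alpha` gives smoother but lazier updates, which matches the "smooth progression" described above.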
A bit strange that the destruction code is slower now that you aren't modifying a mesh, is the world generation slow now as well? Also I thought you taking advantage of spatial locality in the octree structure was pretty smart, good job!
Thanks :)
As for the destruction code being slower, that is something I'm currently fixing. The reason for this is that my raytracer can't receive my octree directly since the octree works with objects/references ( each node owns child nodes that own child nodes... ) and that can't be used in a shader, so every time the map changed ( load, unload, destroyed ), I had to build a "copy" of my octree in a way the raytracer can use ( all the nodes in a vec ) and update the shader's buffer. I also had to calculate the neighbors and cache that to speed up rays, but that was pretty slow, especially when I recalculated the whole map on every edit. I'm currently making it so I keep both "versions" of my octree updated at the same time so I don't have to rebuild the whole raytracer version every time it changes. So far it's kind of broken but I can already tell that it is much faster. I'm gonna show that in the next video ;)
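To make the "copy of the octree as a flat vec" idea concrete, here is a rough Python sketch. The `Node` class and the tuple layout are assumptions for illustration, not the actual project code (which is Rust); the point is only that object references get replaced by indices into one list.

```python
class Node:
    """Pointer-style octree node: owns its children directly."""
    def __init__(self, voxel_id=0, children=None):
        self.voxel_id = voxel_id          # 0 means "nothing placed here"
        self.children = children or []    # empty list for a leaf

def flatten(root):
    """Return a list of (voxel_id, first_child_index, child_count) tuples.

    Children of a node are stored contiguously, so one index is enough.
    """
    flat = [None]
    queue = [(root, 0)]
    while queue:
        node, index = queue.pop(0)
        first_child = len(flat) if node.children else -1
        flat[index] = (node.voxel_id, first_child, len(node.children))
        for child in node.children:
            flat.append(None)                     # reserve the child's slot
            queue.append((child, len(flat) - 1))
    return flat
```

Rebuilding this list on every edit is the slow path described above; keeping it updated incrementally means patching only the entries that changed.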
This is really cool! If you're struggling with performance, a really good optimisation is something called progressive rendering. It's where you render only a quarter of the frame's pixels each frame but accumulate frames over time, so that you do the work of 960x540 instead of 1920x1080, with the trade-off that there will be a bit of motion blur.
Oh yea that does sound pretty good, thanks!
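One possible way to pick "a quarter of the frame" each time is to trace one pixel out of every 2x2 block and rotate which one over four frames. The sketch below is in Python and the offset order is an arbitrary assumption; other patterns would work too.

```python
def pixels_for_frame(width, height, frame_index):
    """Yield the (x, y) pixels to trace this frame: one pixel of every 2x2 block."""
    offsets = [(0, 0), (1, 1), (1, 0), (0, 1)]
    ox, oy = offsets[frame_index % 4]
    for y in range(oy, height, 2):
        for x in range(ox, width, 2):
            yield (x, y)
```

After four frames every pixel has been refreshed once, which is where the slight motion blur comes from when the camera moves.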
Very cool! I've been thinking about doing this as well. One thing I thought a lot about is how the raytracing part would actually work. There are basically 3 types of raytracing that I know of:
- First, is the ray stepping which is what you did.
- Next is ray marching, which is similar, but instead of stepping one unit every iteration, you step the most amount you can without hitting something for sure. There are many great videos on ray marching and SDFs and I recommend checking it out.
- The last one is what I encourage you to look into the most, which I call ray calculating where you do some math to see if the ray hit a cube in a single step! "The Cherno" made a very good explanation on the math behind it, so check it out if you're interested!
What's different about ray calculating is that you'd have to check every single voxel to see if the ray hit it, BUT I've also been thinking about octrees, and you can check if the ray hit (or is inside) the highest parent first; if it does, then check its children for the same, then their children and so on and so on. By the end, you should have only a couple of possible voxels left, all on the ray, just at different distances. Then you can just iterate through those and get the closest.
It might not be faster than ray marching, but I personally think it would.
I hope my explanation was understandable... Good luck whatever you choose!
Oh so you find all the voxels in the path of the ray and then pick the closest ? That's pretty smart. I wonder if it would be faster. Currently my ray checks around 100 nodes, so it's already pretty optimized... I might give it a shot to see. Thanks !
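Here is a small Python sketch of that "test the parent box first, then its children, then pick the closest" idea. It assumes the usual slab method for the ray/box test and a made-up node layout `(box_min, box_max, voxel_id, children)`; it says nothing about whether this beats stepping through neighbors.

```python
def ray_box(origin, direction, box_min, box_max):
    """Slab test. Return (t_near, t_far) if the ray hits the box, else None."""
    t_near, t_far = 0.0, float("inf")
    for o, d, lo, hi in zip(origin, direction, box_min, box_max):
        if d == 0.0:
            # Ray is parallel to this pair of faces.
            if o < lo or o > hi:
                return None
            continue
        t0, t1 = (lo - o) / d, (hi - o) / d
        if t0 > t1:
            t0, t1 = t1, t0
        t_near, t_far = max(t_near, t0), min(t_far, t1)
        if t_near > t_far:
            return None
    return t_near, t_far

def closest_hit(node, origin, direction):
    """Return (t, voxel_id) of the nearest filled leaf on the ray, or None."""
    box_min, box_max, voxel_id, children = node
    hit = ray_box(origin, direction, box_min, box_max)
    if hit is None:
        return None                      # whole subtree is skipped
    if not children:
        return (hit[0], voxel_id) if voxel_id else None
    hits = [h for h in (closest_hit(c, origin, direction) for c in children) if h]
    return min(hits) if hits else None
```

Visiting children in front-to-back order and stopping at the first hit would avoid collecting all candidates, at the cost of a bit more bookkeeping.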
There are ways around mesh refresh lag, like having a double/triple-buffered mesh that allows you to update changes without halting the main thread, but tbh voxels are actually well suited for ray tracing.
Good stuff Max! The visualizations of orthographic versus perspective projection were very helpful. Since you're using Bevy now, are you using WGPU and WGSL? What do you think of the API, or are you still working in GLSL?
Thanks ! Yes I'm using WGPU and WGSL. I think it's pretty similar to OpenGL and GLSL except maybe for setting up the bindings, like the buffers. Once you've got that done it's very nice. I'm thinking maybe of posting a little demo raytracer on GitHub using Bevy, not the whole project you know, just the base to help people get started. Let me know if you're interested. :)
5:09 Blud accidentally recreated 4D Miner 💀
with my bevy raytracer, i just copy the camera matrix from the camera3d and pass it in a uniform buffer. works great. no need to touch any matrix math
That's really cool ! What's ur GPU ?
I have an RTX 3060 but I'll try to optimize things enough that it can run on less performant GPU too
@@MaxMakesGames Alright, thanks ! It is still impressive, keep going dude
Awesome work!
Are you using hardware accelerated RT or software?
Thanks !
I'm not using any hardware stuff. Partially because I don't know how haha but also because my raytracer is optimized to work with my octree, which I doubt hardware RT is. I also want to make sure as many people can run this as possible for when I make a game in it. I just have a compute shader that traverses my octree with rays and sets pixels on a texture. :)
@@MaxMakesGames makes sense.
One thing worth thinking about is whether the octree is actually faster on the GPU?
Assuming that your voxels are at integer intervals in world space, you can just truncate the current check position for a ray and know which block you're in. The modulo operator would give the position within that block, too!
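As a quick illustration of the truncate-and-modulo lookup, here is a Python sketch. It uses `floor` rather than plain truncation because truncation rounds negative coordinates toward zero, which would put -0.5 in block 0 instead of block -1; Python's `%` is floor-based as well, while in languages where it truncates, negative positions need extra care. The function name is invented for the example.

```python
import math

def locate(position, block_size=1.0):
    """Return (block_index, offset_inside_block) for a point in world space."""
    block = tuple(math.floor(p / block_size) for p in position)
    offset = tuple(p % block_size for p in position)
    return block, offset
```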
That is a very good idea and in fact I did something similar when I used chunks to get voxels. But there are problems with that method:
- The intervals have to be equal, meaning every voxel has to be the same size and all the space has to be filled so if I want small grass pieces, I'd need every voxel to be that size.
- The ray would have to check every interval until it hits something. With small intervals for the reason above, that would mean a lot of steps for each ray. The mountain may be 200 voxels away so 200 steps. Currently my limit is 100. And most of the steps would be checking empty space.
I use an octree because it allows for different sizes and empty space. Even if my grass pieces are really small, the rest can be bigger so it takes fewer steps for the ray to go through, and if there's a large empty space ( like between 2 mountains ) it will be filled with minimal empty nodes so the empty space can be traversed in 1-4 nodes ( of let's say size 25 ) instead of 100 with the fixed intervals ( of let's say 1 ). And I cache the neighbor of each node for the raytracer so it's as fast or even faster than using the position and interval.
I hope I explained that well.
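The step counts in that comparison are simple arithmetic. A tiny Python sketch, assuming an axis-aligned ray and cells that all have the same size along its path (real octree nodes vary in size, so this is only the best case):

```python
import math

def steps_through_empty_space(distance, cell_size):
    """Number of cells an axis-aligned ray crosses over the given distance."""
    return math.ceil(distance / cell_size)
```

With the numbers from the reply, 100 units of empty space cost 100 steps at cell size 1 and 4 steps at node size 25.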
@@MaxMakesGames Compute Shader is something which by definition runs on SHADERS meaning your GPU... so it IS hardware RT, at least partially xddd
if you plan to use this in a future game and need music for it, hit me up.
Wow! I'm working on a voxel engine right now as well. This seems incredible! Is this effectively occlusion culling?
Thanks ! No I don't think there's anything like that. It's just moving from the camera through the octree until it hits a voxel and renders its texture for each pixel. It will definitely be helpful to add effects later though, since I control the whole render process.
@@MaxMakesGames Right, but that process only ends up rendering voxels that the camera can actually see I assume. My voxel engine doesn't struggle with FPS right now, but I know when my world is further sculpted I will likely begin to struggle. I currently render every face in a chunk, if the chunk is within the view frustum, but this ends up rendering a lot of meshes the player can't see. I don't know if this method would be faster, as it seems like every time the camera moves you have to recalculate the meshes within the camera and load them all into a buffer. But I can't see how you could possibly be doing all those calculations every time the camera moves at all and still maintain FPS, seems very very expensive.
I am not doing any calculations. I don't need to. When the ray goes through the octree, it moves in the direction of the camera and hits things there. Any voxel outside the view of the camera just won't be reached by the ray, I don't need to cull them at all. That's the beauty of raytracing :)
I send the whole octree to my raytracer and it just gets the data from the nodes it needs starting at the main big ones and going down and then moving to the neighbors until it finds one that is filled. If a node is behind the camera ? It won't be checked because it won't be in the path. It's like the ray is making little steps. It only needs to check the octree node at the next step to move, unlike meshes that need to check all the faces to see if they fit in the camera matrix.
@@MaxMakesGames that makes sense. So once you have found all of the visible voxels, do you then construct a new VBO and EBO containing all those voxels whenever the set of visible voxels changes? That's the part I'm struggling with, I think I am going to implement this as a test, because it also can solve the problem of determining order to draw transparent blocks as a byproduct
Yea whenever there is a change in my voxels ( load, unload, destroy ) I construct a new buffer ( in my case it's a storage buffer because I use WGSL but in GLSL it would probably be a VBO ? ) with my octree data ( all the nodes with their positions, sizes, voxel id and neighbors ) and send that to the shader to use for render. Building the buffer can take a bit of time so you can build it async and once it's done send it to avoid lag spikes. Then the shader uses that buffer to traverse the octree and detect what each ray hits. I find that building the buffer is a lot faster than building the mesh from the octree/chunks ( like I used to do ) and sending the buffer is a lot faster than spawning/updating the mesh was so that's great. And I don't have to worry about culling voxels and faces because the raytracer doesn't need that :)
Damn I'm making the same thing in a WebGL page, and I'm stuck at the same issue you have at 4:40 . Do you still remember how you fixed it?
Well your problem is probably not the same, but in my case it was that my ray was taking too big steps. I was moving the distance of the parent node when I reached an empty node with no children
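One way to avoid overshooting like that is to advance the ray exactly to where it leaves the node it is currently in, instead of by a fixed length such as the parent's size. A Python sketch of that calculation follows; this is an assumption about a reasonable fix, not necessarily what was done in the video.

```python
def exit_distance(origin, direction, box_min, box_max):
    """Distance along the ray to where it leaves the box it is currently inside."""
    t_exit = float("inf")
    for o, d, lo, hi in zip(origin, direction, box_min, box_max):
        if d > 0.0:
            t_exit = min(t_exit, (hi - o) / d)   # leaves through the max face
        elif d < 0.0:
            t_exit = min(t_exit, (lo - o) / d)   # leaves through the min face
    return t_exit
```

For a ray at x = 1.5 inside an empty child spanning x = 1..2, this gives a step of 0.5, whereas stepping by the parent's size of 2 would jump straight over the neighboring node.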
*@MaxMakesGames*
Congratulations!
You actually achieved what the YT channel *Euclideon* (original movies deleted, unfortunately) could not. Using an octree + ray-tracer & voxels is a really smart combination!
See Reply for link (in case YT is stupid & auto-deletes the comment).
Title: *Unlimited Detail Technology*
By: *Quipster99*
The link-comment got auto-deleted (as expected, unfortunately), but the title & channel comment survived.
Great results! I'd like to know one thing. How do you load/unload the world on the GPU using an octree? I'm also making a voxel game and using a chunk system, so loading chunks into SSBO is easy. But it’s not very clear about the octree
I send a buffer of all of the octree nodes with their pos, size, etc and the indices in the buffer of their children, so then on the GPU I can use the buffer and go to the index of a child to move down into it until I reach a leaf or a filled voxel. Hope that's clear :)
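Roughly, "go to the index of a child until you reach a leaf" could look like the Python sketch below. The dict fields and the convention that a child index of -1 means "nothing placed there" are assumptions for illustration; on the GPU this would be a loop over a storage buffer instead.

```python
def find_leaf(nodes, point):
    """Walk down from the root (index 0) to the leaf that contains point.

    Each node is a dict with 'pos' (min corner), 'size', 'voxel_id' and
    'children' (8 buffer indices, or an empty list for a leaf).
    Returns the leaf's index, or -1 if the point lies in empty space.
    """
    index = 0
    while index >= 0 and nodes[index]["children"]:
        node = nodes[index]
        half = node["size"] / 2
        # Build the octant number from one bit per axis.
        octant = 0
        for axis in range(3):
            if point[axis] >= node["pos"][axis] + half:
                octant |= 1 << axis
        index = node["children"][octant]
    return index
```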
@@MaxMakesGames Yes, this is understandable, I meant how a local section of the octree is loaded. That is, the visible area. After all, the GPU has a rather limited buffer size and you cannot load the entire octree there
@@Duxen8956 well I separate my world into "main nodes" that are big and I load the world into them. As I move around, I remove main nodes far away and load ones where you moved. All the main nodes are sent to the shader. A node is only around 32 bytes I think ? So even if I send a million nodes it's around 32MB of VRAM used, it's not that much. It's limited by the lag of stepping through the nodes to cast the ray more than the size of the octree and memory tbh
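To sanity-check the "around 32 bytes per node" estimate, here is a Python sketch that packs one node with the standard `struct` module. The field layout is purely hypothetical (the real node also stores cached neighbors, so its layout must differ); it only shows how a handful of 4-byte fields add up to 32 bytes.

```python
import struct

# Hypothetical layout: position (3 x f32), size (f32),
# voxel id (u32), first child index (u32), two spare u32 slots.
NODE_FORMAT = "<3f f I I 2I"

def pack_node(pos, size, voxel_id, first_child):
    """Pack one node into the byte layout described by NODE_FORMAT."""
    return struct.pack(NODE_FORMAT, *pos, size, voxel_id, first_child, 0, 0)

NODE_BYTES = struct.calcsize(NODE_FORMAT)
```

At 32 bytes each, a million nodes come to 32,000,000 bytes, which matches the rough 32MB figure above.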
I don't think what you are doing is ray-tracing, I think it is called path-tracing.
Haha well that's possible, I never understood the differences. They all kind of do the same thing tho, right ? Both trace rays to get results. I think one sends the ray in random directions while the other does math to get the result, but whatever, raytracing sounds cooler :)
This is in fact ray tracing. RT sends rays from the player camera POV into the scene and records the color values. PT starts from light sources in the scene and then renders based on rays that hit the camera.
@@DigitalJedi Nope! That’s another whole can of worms called light-based path tracing (compared to eye-based). Raytracing is simply the simulation of how rays of light interact with objects in the scene, while path tracing only takes those rays into account if they then hit a light source, including recursive rays to achieve that sweet, sweet indirect lighting!
This is definitely ray tracing, path tracing is all about the ray’s interaction with light, compared to just checking the dot product with the direction to the light source 😅
@@KaidenBird so, to understand: this would have been path tracing if rays were cast from the traced ray towards the light source? Or what makes the difference here? Also, now that I think about it, is path-tracing a sub-category of ray-tracing?
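For reference, "just checking the dot product with the direction to the light source" is plain direct diffuse lighting, which in Python could be sketched as below (the function name and `albedo` parameter are illustrative). A path tracer would, in addition, keep spawning bounce rays from the hit point and average many of them to gather indirect light.

```python
def lambert(normal, to_light, albedo=1.0):
    """Direct diffuse lighting from the angle between the surface normal
    and the direction to the light. Both vectors must be unit length."""
    cos_angle = sum(n * l for n, l in zip(normal, to_light))
    return albedo * max(0.0, cos_angle)
```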
Are you planning on implementing anti-aliasing with the new raytracing renderer?
I am not planning it yet because I don't think it's a big deal to smooth things especially since voxels are kinda blocky and pixely already, but maybe one day.
"every 3D renderer is first a 4D renderer" -idk
You should test it on a potato spec pc to see how it runs
Oh, your "goes brrr" was weird xD
NICE BROTHER
OMAGA THIS IS SO EPIKO..... CAN YOU NOW ShoW mE ThE FINALE BOSSE
Where and when can I download this.
Thanks for being interested. There isn't much gameplay right now so I don't have anything released. Of course once there is some gameplay and it is playable I'll release it so people can play !
If you are talking about the code, the code right now is very messy and unclear so it's private. However, I'm currently considering that I could clean the code and post the code on github once the raytracer is done and working well. I'll think about it.
@@MaxMakesGames Honestly, I'd download an EXE in the current state just to see performance.
hi bro
Why not ray marching? All your geometry are squares, so it should be simple
That's probably what I do but I don't know all the different ray method definitions and I wanted to focus on what I did not what it's called
@@MaxMakesGames ray marching is using variable step length for each ray, where the length is determined by an estimation function, which is able to show you the shortest distance to the object. It allows you to take bigger steps, safely, because you know that within that distance there is no geometry. Look up CodeParade's Marble Marcher, where ray marching is used to render highly detailed 3D fractals, on which you race to the flag. It even uses the estimation function for the physics.
@@redo1122 Then it sounds like I use ray marching because I do big steps when I can
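The variable step length described above is often called sphere tracing. A minimal Python sketch, assuming a single sphere as the scene and made-up default values for the step limit and thresholds:

```python
import math

def sphere_sdf(p, center=(0.0, 0.0, 5.0), radius=1.0):
    """Signed distance from point p to a sphere (negative inside)."""
    return math.dist(p, center) - radius

def ray_march(origin, direction, sdf, max_steps=64, epsilon=1e-4, max_dist=100.0):
    """Sphere tracing: step by the distance estimate until close enough.

    Returns the distance travelled, or None if nothing was hit.
    """
    t = 0.0
    for _ in range(max_steps):
        p = tuple(o + t * d for o, d in zip(origin, direction))
        dist = sdf(p)
        if dist < epsilon:
            return t
        t += dist            # safe: no geometry is closer than dist
        if t > max_dist:
            return None
    return None
```

Skipping a whole empty octree node in one step is similar in spirit, except the safe step comes from the node's bounds rather than from a distance function.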
Alr