ALL IT TAKES... A Vulkan Story

The Cherno

134,262 views

Published on 4 Jan 2025
Patreon ► / thecherno
Instagram ► / thecherno
Twitter ► / thecherno
Discord ► thecherno.com/...
Today we're diagnosing Hazel's slow Vulkan renderer.
#Hazel

Comments • 312

@TheCherno 3 years ago ⁺¹⁸⁰
Hope you all enjoyed the _journey_ - I for one am definitely glad it's over. What's the worst dumb mistake you've ever made which cost you too much time?
@universalponcho 3 years ago ⁺¹⁸
Not learning to code at a young age..
@jakub7321 3 years ago ⁺¹⁸
forgetting a semicolon
@LienKim-ry4bc 3 years ago ⁺¹⁹
Abandon the old project and create a new project which is the same as the old one
@oamioxmocliox8082 3 years ago
;)
@ptitbgdu8040 3 years ago ⁺³
Sometimes I change my functions to be way shorter and simpler just to debug the entire code more easily. Exactly like you did when you modified your fragment shader. I do at most probably between 5 and 10 simplifications. And when it's time to come back to normal, I just forget one of these simplifications and it generates a bug 400 new lines of code later. Then it takes me several days to understand that the problem comes from a wrong simplification I did earlier on purpose to debug my code on a previous debugging session...
@dominokos 3 years ago ⁺⁶⁸⁷
"I don't know. Maybe I'm just doing errors only. Who knows. Maybe i'm a clown. I am a clown." This is the most relatable programming video ever lmaoooo
@jacobm1190 3 years ago ⁺⁷⁹
Also "maybe the comment is the slow part" at 13:36 lmao. That's the kinda stuff I start thinking in frustration after debugging for hours
@nt4f04und 3 years ago ⁺⁷
lolololololol
@itzcrazydan 3 years ago ⁺⁴¹²
This just goes to show you.
Whether you have 1 year or 10+ years experience programming..
It happens to all of us.. one little oversight, typo, you name it.. that will haunt us forever.
@fredg8328 3 years ago ⁺⁸
Debugging is like 80% to 90% of a programmer's job. You type a line of code and you forget the semicolon at the end, that's already a bug to fix.
@qx-jd9mh 3 years ago ⁺¹
You can use "design by contract" to prevent messing up subtle invariants....
@po210 2 years ago
Experience actually will never prevent you from making mistakes. There are other methods that prevent you from making them.
@maxemore 3 years ago ⁺⁷⁶
The emotional rollercoaster of this video is a great all-around explanation of how programming is :D
@theralf6454 3 years ago ⁺⁵
**spends 3 days on fixing a bug**
"ALL I DID WRONG WAS TYPE "C" INSTEAD OF "G"!!!"
Lol
@rendoirs 3 years ago ⁺²³³
Man, let's appreciate for a second the legend that knew what was up with just a screenshot from the profiler. Game engine programmers are of a different breed!
@The101Superman 3 years ago ⁺²¹
tbf, as he mentioned himself, seeing PCIe usage at near max should have been hint enough
@jamesmnguyen 3 years ago ⁺⁷
@@The101Superman Also realizing that someone could accidentally place all their vertex data in system RAM. Usually people coding with Vulkan really emphasize GPU RAM.
@skilz8098 3 years ago ⁺⁴
Try Verilog with designing the layout of a CPU's internals via its logic gates, data and address paths...
@gileee 3 years ago ⁺¹⁶
@@skilz8098 That's college level stuff. This is basically working with a black box running tens of millions of lines of code, with a billion gotchas.
@skilz8098 3 years ago ⁺⁶
@@gileee I know what you are saying... but you wouldn't be saying the same thing if you're engineering a CPU and you have to account for every single wire, connection, transistor, etc... then when all of that is properly connected and you think you are done, then it's a matter of designing the ISA and how both instructions and data are represented... from there it's a matter of writing your own assembler. Granted, there are many highly sophisticated tools today to help streamline that process, but imagine having to do that by hand without the aid of any modern computer, device or software! They may teach the basics of this in some colleges and universities, but there is just as much that they leave out! Also, I'm 100% self taught, 0 college education! I took the initiative to follow my own ambitions, desires and goals. I've always been intrigued with electronics.
@TheArmenianSolider65 3 years ago ⁺³⁴
I love how niche of a video this is and the community that has rallied behind this guy to succeed. Never leave a nerd behind.
@EmanuilGlavchev 3 years ago ⁺⁵⁴
Oh that was a fun journey... Here's to all the friends that solve our issues with a quick glance and a fresh mindset!
@kristofgyorffi7703 3 years ago ⁺²⁶⁶
Cherno: "Zooms in on CPU only"
Me: "Starts laughing both hilarious and empathetic, feeling the pain and relief of not finding that one bug for a week, that someone else points out in a second"
@Raspredval1337 3 years ago ⁺¹⁶
One day I accidentally used an exit-current-thread call instead of killing a specific thread. Took me about half a day to figure it out
@Mystixor 3 years ago ⁺⁹
Haha I thought it was so amazing how his friend pointed it out immediately only from looking at the Nsight screenshot
@gittawat6986 3 years ago ⁺⁷
Man. Computer programming is like blindly assembling a machine. You never actually know how all of it works. In fact you might not even remember how you assembled each piece of code together.
@homomorphic 3 years ago ⁺⁷
@@gittawat6986 no... Just no... Programming is about knowing exactly how everything works. The problem is that schedules are often made assuming that you don't need to know everything, and those schedules are broken, because you do need to know how everything works otherwise you're not engineering you're gambling (that something will work as you want it to).
When you fly on a modern fly-by-wire airliner you better hope that the programmers understand how everything works because that airliner can't stay in the air unless that software works perfectly.
This is why every programmer needs to know assembly and why interpreters are a fundamentally broken concept (they are obfuscation that makes it inherently more difficult to understand how everything works).
@piethein4355 3 years ago
Yeah this one is hilarious since it is such an obvious thing to run into when writing a Vulkan renderer as well
@nolram 3 years ago ⁺²⁰⁴
You have mastered the art of thumbnails. This is peak.
@nickvelos9571 3 years ago ⁺²
Faxxx
@nahu4870 3 years ago
oh hey, I know you!
@nickvelos9571 3 years ago
@@nahu4870 you know who?
@_lapys 3 years ago
Hello, Nolram 👋🏾
@nolram 3 years ago ⁺¹
@@_lapys Uhm hello ? Where does everyone know me from lol ?
@rainbowpikmin 3 years ago ⁺²²
So happy to see some Vulkan content on this channel! Keep up the good work!
@CombatFXZone 3 years ago ⁺¹¹
I have so much respect for graphics programmers. I had to write some OpenGL stuff in uni and was so dramatically overwhelmed by the fundamentals and terminology. You can spend so much time and energy on this topic, can't even begin to imagine how brutal Vulkan must be. Also really cool content, thank you!
@Don-ol9ir 3 years ago ⁺⁵⁶
Something eerily similar happened to me, only with my DirectX12 implementation. It was waaaay slower than the OpenGL. Like you I fired up the nsight profiler and saw PCI throughput being the bottleneck. Turned out I was still using upload heaps for my vertex buffers (instead of default heaps). I even had a TODO comment there saying I need to fix that. Oh well, learning happened.
@clyde34 3 years ago ⁺²
As soon as I saw the PCIE graph, I screamed NOOOO and laughed. I knew exactly what was coming.
You had a good intuition to what you might be doing, but were guessing the wrong target.
@skilz8098 3 years ago ⁺⁵⁵
OpenGL holds your hand and makes many assumptions, "Give me your data and I'll try to draw it accordingly".
Vulkan on the other hand is Explicit: "Tell me everything you want to do and how you want it done and I'll do it exactly that way!"
@KangJangkrik 2 years ago ⁺⁷
Skia: j-just tell me what to draw, but it has to be 2D
@brunomarques5258 6 months ago
Hi Cherno, thanks for the graphics programming content. Partly inspired by you, I've started to learn Vulkan and D3D12, and I found VMA's sibling, D3D12MA, and have now submitted pull requests for 2 features, all in CMake. I went the CMake route instead of Premake, which you prefer, but I really thank you for introducing me to this material.
@SimonBuchanNz 3 years ago ⁺²⁶
The LunarG Vulkan debug layers do in fact give a bit of performance validation if you ask, quite handy, if fairly minimal right now.
My first guess was that it was just hitting V-Sync 😄. Pretty obvious that it was buffer residency as soon as I saw that PCI bandwidth cheese block, but I wondered if you had somehow managed to allocate your render target images as CPU only!
@Mozartenhimer 3 years ago ⁺⁷
This is really good content. Love the Vulkan debugging stuff. Don't see much of this on YouTube.
@jimmiebergmann5455 3 years ago ⁺¹
I feel you! I knew what was coming when you showed us that PCI throughput graph :) I love Vulkan and the control you get, but with great power comes great responsibility. Looking forward to your next video.
@RC-1290 3 years ago ⁺³⁸
One of Vulkan's validation layers is for best practices. If this isn't on there, you might want to request its addition.
@fahd2372 2 years ago ⁺¹
This is totally the best video I have seen on youtube so far! I loved every part of it :D Absolutely amazing story!!!
@theoathman8188 3 years ago ⁺⁶
Thank you for uploading this video. It's very educational. I mean many people face problems like that and once it's over they breathe a sigh of relief and move on. However, you took your time to tell us about it and that's really important.
@LS-cb7lg 3 years ago ⁺³
honestly, this is some quality tv for me :D glad you did it!! keep it up
@mattkey7226 3 years ago ⁺³
This was actually so useful to see. Thanks for sharing!
@DT-ew2zz 3 years ago ⁺¹²
I would LOVE to see a Vulkan Tutorial from You
@0xL4 3 years ago ⁺⁴
24:42 This is implementation defined. The GL_STREAM/STATIC/DYNAMIC_DRAW hints are exactly that - hints. From what I understand, these used to mean something, but driver vendors have more or less stopped caring about them, since misusing them was so common that it was better to just have the driver decide.
@jukit3906 2 years ago
I think the new ARB_buffer_storage API (core in OpenGL 4.4) has a more detailed impl of that for glBufferStorage(): instead of hints, it uses flags really similar to Vulkan's: GL_CLIENT_STORAGE_BIT will make the buffer cpu side if possible, GL_MAP_PERSISTENT_BIT will make the buffer persistently mapped but likely to be cpu side too, GL_DYNAMIC_STORAGE_BIT will make the buffer able to copy from cpu data but still be gpu-side, and 0 will make the buffer fully invisible cpu-side, therefore fully on the GPU. Persistent mapping is often used for staging buffers btw, and you can use glCopyBufferSubData to copy the buffer's contents
@Alexander_Sannikov 3 years ago
Here's a fun fact: say you have some HOST_COHERENT memory that you're reading from a shader. Say, you're storing there a vertex buffer of your big point cloud (50m+ points). You can get the total size of your vertex buffer, divide it by the time of your GPU pass using this vertex buffer and you'll get your PCI-e speed almost exactly (give or take 2%). Because your shader literally reads your RAM by streaming it over your PCI-e with its full bandwidth, introducing practically no extra latency. I don't know about you, but I find this practically magical how they(hardware guys) achieve that.
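The arithmetic described in the comment above can be sketched in a few lines of C++. This is only an illustration: the function names are made up, and the figures in the usage note (800 MB of points, a 50 ms pass) are invented for the example, not measurements from the video.

```cpp
#include <cassert>
#include <cstdint>

// Effective bandwidth in GB/s (1 GB = 1e9 bytes), assuming the pass reads
// `bytes` of host-visible memory exactly once in `seconds`.
double effectiveBandwidthGBps(std::uint64_t bytes, double seconds) {
    assert(seconds > 0.0);
    return static_cast<double>(bytes) / seconds / 1e9;
}

// Lower bound on the pass time in milliseconds if every byte has to cross
// a bus that sustains `busGBps` gigabytes per second.
double minPassTimeMs(std::uint64_t bytes, double busGBps) {
    assert(busGBps > 0.0);
    return static_cast<double>(bytes) / (busGBps * 1e9) * 1e3;
}
```

For example, 800,000,000 bytes read in 0.05 s works out to roughly 16 GB/s, and the same buffer over a 16 GB/s link cannot be read in much less than 50 ms per pass, however simple the shader is.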
@movax20h 3 years ago ⁺¹
In Vulkan validation layers you can enable "PERF" level, which gives hints about various sub-optimal uses of Vulkan API. I am not sure if it would catch this one, but it is worth a try.
There is VK_LAYER_LUNARG_assistant_layer, which is essentially designed for these purposes and should detect this issue.
Also, there is nothing wrong with Vulkan allowing you to do that or other stupid things. Using CPU only memory for some stuff sometimes makes perfect sense actually.
I am glad you found the issue and fixed it, and learned something new.
@Saturn2888 3 years ago ⁺²
As soon as I saw that PCI throughput, I realized it was system RAM, but I didn't realize you could manually do that.
@ietsization 3 years ago ⁺¹⁶
Honestly, this was pretty comforting because I was expecting some far stranger Vulkan behavior. This is certainly unfortunate and hard to debug but the reason for it being slow is very obvious
@Rufnek2014 3 years ago
I don't write code but darn...love watching the whole process and getting an understanding of game code behind the scenes.
@SeanHarmer 3 years ago
Hehe, the joys of having all of the control with Vulkan! Nsight is such an incredibly useful tool for such things. I spent a few hours last weekend wondering why vkCmdDrawIndexed() was not drawing anything only to then realise that even when not using instancing that instanceCount needs to be set to 1 and not 0. D'Oh!
As someone writing a new Vulkan based engine I'm really enjoying the series. Keep going! :)
@dimitribobkov-rolandez5729 3 years ago
This makes me feel less bad about my own (terrible) Vulkan renderer! Nice video, and congrats on finding the bug!
@Gunslinger962 3 years ago ⁺¹
Your thumbnails are getting better day by day
@abebarker 3 years ago
I like those types of hard won nuggets of information. People spend their lives looking for the most useful and precious nuggets.
I have it in my head to build a model of all the pieces of, say, the Hazel engine, and animate it. I know that there are always multiple layers of abstraction between the conceptual object and the metal and that may be useful to illustrate as well. We will see if I ever make any actual progress, I've got other demands on my time.
@woolfel 3 years ago ⁺¹
Love this video. I always tell young developers "no matter how long you've been programming, you will make silly mistakes." Unit and regression tests are your friend. If you don't have good test coverage, simple typos will bite your butt.
@witchaponkitthaworn5998 3 years ago
Hello from Thailand, I am by no means a programmer of any sort, but watching this I can see how developers actually work. This is a good video to listen to while working from home... it puts me in a kind of working mode..
@playerguy2 3 years ago ⁺³
I've been trying to make the basic "Hello triangle" app in Vulkan and _"sounds like your vertex buffers are stored in RAM."..._
I felt that.
@jamesmnguyen 3 years ago ⁺⁷
One time I set the RenderPass attachment store op to DONT_CARE and didn't even realize it. I was banging my head for like an hour wondering why my window was displaying garbage. Vulkan basically did my rendering, saw the enum and just dumped the result into undefined territory, because, after all, the programmer said to discard the output.
This was before I had access to Nsight. I wonder how fast I would've solved the mistake if I did have it.
@HobokerDev 2 years ago ⁺³
You're so lucky to have someone you can ask for help. Imagine looking for this error all on your own. :(
@gerardgonzaleznavarrete8023 3 years ago
Two minutes into the video I had a clue of your problem (Yeah, it's annoying that, of all the verbose settings Vulkan exposes, that one little thing was a game changer :sadface:). It's nice that you've gone through the journey and pointed out the understanding of hardware. Vulkan and DX12 are very explicit in management and control over the hardware involved, and it was also nice to see a capture to dig deep in the problem solving. Nice video, keep up the good work, Hazel is looking very cool!
@senhorcorvo 3 years ago ⁺²
Am I watching a man slowly descending into madness while trying to learn Vulkan?
@Cleanser23 3 years ago
fellow Vulkan renderer veteran. I am sorry, I feel your pain. As soon as I saw the PCI throughput that high I was thinking, he must be copying over every frame.
Congrats on fixing this one line hell
@nexovec 3 years ago ⁺¹⁵
*stares into the screen for a week.
*changes one letter
*speed go brr
that's scary, man!!
@shavais33 8 months ago ⁺¹
Many moons ago I was using DirectX, and trying to get my game to recover from alt-tab or minimizing and restoring the game window. When the window is reactivated, I had to reload all the textures into the gpu from system memory, which meant I had to keep a copy in system memory. And I didn't have enough gpu ram to store all the textures, so I killed two birds with one stone by implementing a caching system with an lru list. Whenever I went to draw something, if my gpu buffer for a given texture was either not loaded or invalid, I'd reload it from its system memory buffer, and keep track of how much total gpu ram was in use (for textures), and when it passed a threshold, I would repeatedly unload the gpu buffer for the least recently used texture (that was still loaded) until I (theoretically) had enough gpu ram to load the texture I was trying to load. Before I created that system I did suffer from a lot of artifacts and slowness and headaches. After I implemented it, things were much smoother and faster. Many, many moons ago.
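A caching scheme like the one described above can be sketched roughly as follows. This is a minimal sketch, not the commenter's code: `TextureResidencyCache`, `touch` and the byte budget are hypothetical names, and the actual GPU upload and unload calls are replaced by bookkeeping so the example runs on its own.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <string>
#include <unordered_map>
#include <vector>

// Tracks which textures count as resident in GPU memory and evicts the least
// recently used ones when loading a new texture would exceed a byte budget.
class TextureResidencyCache {
public:
    explicit TextureResidencyCache(std::size_t budgetBytes) : budget_(budgetBytes) {}

    // Call before drawing with `name`. Returns the names evicted to make room.
    // A real renderer would re-upload from the system-memory copy here.
    std::vector<std::string> touch(const std::string& name, std::size_t sizeBytes) {
        assert(sizeBytes <= budget_ && "texture is larger than the whole budget");
        std::vector<std::string> evicted;
        auto it = index_.find(name);
        if (it != index_.end()) {
            // Already resident: move it to the front (most recently used).
            lru_.splice(lru_.begin(), lru_, it->second);
            return evicted;
        }
        // Evict from the back (least recently used) until the texture fits.
        while (used_ + sizeBytes > budget_ && !lru_.empty()) {
            const Entry& victim = lru_.back();
            used_ -= victim.sizeBytes;
            evicted.push_back(victim.name);
            index_.erase(victim.name);
            lru_.pop_back();
        }
        lru_.push_front(Entry{name, sizeBytes});
        index_[name] = lru_.begin();
        used_ += sizeBytes;
        return evicted;
    }

    bool isResident(const std::string& name) const { return index_.count(name) != 0; }
    std::size_t usedBytes() const { return used_; }

private:
    struct Entry {
        std::string name;
        std::size_t sizeBytes;
    };
    std::list<Entry> lru_;  // front = most recently used
    std::unordered_map<std::string, std::list<Entry>::iterator> index_;
    std::size_t budget_;
    std::size_t used_ = 0;
};
```

The list plus hash map pairing keeps both the "mark as used" and "evict oldest" operations constant time, which matters when `touch` runs for every texture on every frame.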
@shavais33 8 months ago
Why did this video come on my dash today, if it was posted 2 years ago? I keep doing that. I keep necro-posting without realizing it. Argh. Oh well. (I guess it must be because I'm just now finally looking at Vulkan.)
@_rpr1337 3 years ago
This format is super cool dude
@diligencehumility6971 2 years ago
Wow a cool friend, he right off the bat knows what is wrong with your homemade engine, I need a friend like that
@rituparnadas699 3 years ago
I clicked only because of the thumbnail, you nailed it this time.
@Zumito 2 years ago ⁺¹
It could be a programming joke, but it's a true anecdote that your program was slow because of just one letter
@Mallchad 1 year ago
For anybody reading this in future.
Not only is it possible to store things like vertex buffers in system RAM, but they HAVE to be stored in RAM before they are copied to the GPU, it's just how computers work.
What you are really controlling in a graphics API is hinting when and how you would like to pass around data like vertex buffers, and Cherno accidentally set it to basically recopy the vertex buffer every draw command. I basically took 1 look at that profile, saw the high Async Copy Engine and immediately realized it was excess, repeat data copying. I had my suspicions before that but that's only because CPUs and GPUs are so fast that literally 99% of performance problems is data copying from backing memory. Calculations are often basically free in comparison to IO.
I call it a hint because the graphics API is just that, an API, and the underlying OpenGL/Vulkan/D3D implementation has an active program running in the background on the CPU that governs the actual behaviour of the graphical context. And whilst you might request it to do some copying it'll just do it when it makes sense, so long as it conforms to the spec. Technically on some systems you can use DMA (Direct Memory Access) to write directly into the GPU buffer but this is objectively worse than having a well orchestrated graphics context manage the copy when it is sensible. Anyway.
@ibrozdemir 3 years ago
hey thanks for taking the trouble of Vulkan for us man, i stopped developing with Vulkan (for now i only use opengl + directx11), because i tried and saw how much time it takes to deal with even the simplest things
@chucktrier 3 years ago
the Vulkan Road is rocky at best. I am struggling with synchronization but the knowledge you gain is worth it because you know how the GPU actually works. But super cool video.
@rohfeladyaraka8512 3 years ago ⁺⁷
The thumbnails! =)
@MiklosHajma 3 years ago
We've all been there. It doesn't matter how experienced you are. My favourite bug hunt was when we sat in front of the screen for a day with my colleague trying to figure out why the theme colors are wrong (developed a low level UI engine at that time). Every damn piece of code was perfectly fine. Then the night fell and the whole theme changed. This was when it hit us that we were looking at the wrong dataset all the time because the theming engine did a switch at a certain time of day to compensate for daylight (which is important in navigation software). Such a facepalm moment :D
@skilz8098 3 years ago
Back Face & Frustum Culling, Early Occlusion Detection, Vertex Winding Order, Depth Buffer - 2 Render Pass with Sorting for Alpha Transparency, Scene Graph Hierarchy, etc. are all important parts of the Graphics Renderer and Pipeline Stages. Furthermore, setting up a Batch Process and Batch Manager helps significantly. What I mean by this last statement is, you don't want each of your models to have its own render or draw call function that gets invoked within each frame. This will end up being slow as it will cause a bottleneck in the transferring of data from the CPU to the GPU across the Bus or PCI-Express Lanes. Instead, design a Batch Process that will group similar primitives and attributes together into a single bucket. The Batch Manager Class will be responsible for managing the priority queue of the batches. When a batch becomes too full and you try to add more data to it, then the manager class will mark it as being ready to be unloaded and it will send all of that data to the Renderer - Framebuffer - Backbuffer to be drawn. Now, the objects that you store or send to the batches may have additional information as in a priority for rendering, meaning this must be drawn first or as early as possible... Now, when you create this kind of system, there is no 1:1 ratio that fits all environments, scenes or situations. For example, one specific type of game might have say 8 different batches where each batch can store up to 10,000 vertices where a different game might have 10 batches that hold only 8,000 vertices... This is something you'll have to play around with to fine tune it. This can greatly improve your throughput and reduce latency from the CPU to the GPU. Also, the GPU can store a lot of this within its own internal memory and it will handle it when it's ready to. More than just that, but also how you set up your Scene Graph is also key... If it's just a 2D Game, then things are quite easy, but if you have a full 3D Scene...
then there are many different approaches, from BSP with a root node branching down from transforms to shapes, to materials, lights, etc... Another method might be a more advanced version of a BSP such as a Quadtree or Octree. Octrees are a little bit more involved to get initially set up, but once you have them in place and if the volume size of each octant is adjusted correctly then you might only be rendering say one to two dozen of these volumetric cubes at a time due to your view frustum, anything outside of them and those parts of the scene will never be sent to render, but will sit in a queue waiting for you to move through the scene. Once you move through the scene, then different partitions will start to render while others will go out of scope. It does require a little more calculation, but this can be done on the CPU side of things in conjunction with the Batch Renderer. You want to eliminate as many bottlenecks as possible and try to keep data flowing at a constant rate as much as possible.
Creating a fully functional Graphics Renderer - Game Engine from scratch is no easy task!
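The batching idea in the comment above might look something like this in C++. It is a deliberately simplified sketch under stated assumptions: `BatchManager`, `submit` and the integer material key are hypothetical, the priority queue the commenter mentions is left out, and the actual draw submission is replaced by counters.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

struct Vertex {
    float x, y, z;
};

// Groups vertices that share a material key into one bucket and flushes a
// bucket when it cannot hold the next submission, so many small objects end
// up as a few large draw submissions.
class BatchManager {
public:
    explicit BatchManager(std::size_t maxVerticesPerBatch)
        : capacity_(maxVerticesPerBatch) {}

    void submit(int materialKey, const std::vector<Vertex>& vertices) {
        assert(vertices.size() <= capacity_);
        std::vector<Vertex>& batch = batches_[materialKey];
        if (batch.size() + vertices.size() > capacity_) {
            flush(batch);  // the bucket is too full, so draw what it holds first
        }
        batch.insert(batch.end(), vertices.begin(), vertices.end());
    }

    // Call once at the end of the frame to draw whatever is still pending.
    void flushAll() {
        for (auto& entry : batches_) {
            flush(entry.second);
        }
    }

    std::size_t drawCalls() const { return drawCalls_; }
    std::size_t verticesDrawn() const { return verticesDrawn_; }

private:
    void flush(std::vector<Vertex>& batch) {
        if (batch.empty()) {
            return;
        }
        // Placeholder: a real renderer would upload `batch` and issue one draw.
        ++drawCalls_;
        verticesDrawn_ += batch.size();
        batch.clear();
    }

    std::size_t capacity_;
    std::map<int, std::vector<Vertex>> batches_;
    std::size_t drawCalls_ = 0;
    std::size_t verticesDrawn_ = 0;
};
```

With a capacity of 8 vertices, five 4-vertex quads using one material and one quad using another produce four flushes instead of six separate draws. The right capacity depends on the scene, as the comment says.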
@Nickmav1337 3 years ago
"it sounds like your vertex buffers are being stored in system RAM" is the best punchline to any joke I've ever heard
@jungjunk1662 3 years ago
This is superb Cherno.
@GreenClover0 2 years ago
To answer the question at the end, yes, videos like this are helpful :")
@FelipeMendez 3 years ago ⁺²
Thank you for sharing this and, most important, the investigation process/method, it's really helpful! Now I need to get a Windows PC, there are no tools like that for OSX
@F1nalspace 3 years ago ⁺¹
Okay, that's the best kind of video I want to see you making. Having a bug that's extremely hard to track down, analyzing it and actually solving it. That is the most valuable thing for me - especially when using a modern graphics system such as Vulkan.
As a matter of fact, I started getting into Vulkan and even clearing the screen to a color does not work in all cases. On win32 it works perfectly, including re-creating the swap chains - but on Linux X11 it renders but crashes on XDestroyDisplay(). Such bugs are so annoying, because they prevent you from continuing with other things :-(
Also I don't understand why the validation on Linux simply does not work (No instance extension detected), but on Win32 it works just fine - both have the SDK installed and it's the same system (Multiboot) O_o
@nirshalmon1646 2 years ago
This finally convinced me to learn Vulkan
@hahayes7205 3 years ago ⁺¹
When the video doesn't immediately start with "hey what's up guys my name is the cherno" you know something is up
@badpotato 3 years ago
yes, I really enjoyed.. keep making more of these
@betterfly7398 3 years ago
Amazing video!
I think the idea is great. Seeing you solve a problem within your own project is nice. Especially performance issues and stuff like that which make you use these performance tools!
Keep these coming!
PS: There are also these tools called Intel Graphics Performance Analyzers, maybe you should check them out.
@Allstreamer_ 3 years ago ⁺⁴
Would love more storytime
@compsciorbust9562 2 years ago
This reminds me of that time a script typo in Aliens: Colonial Marines broke the AI and no one figured out why it was bugged as hell until 5 years after release.
@delphicdescant 3 years ago
I've considered switching my code to use VMA, but I haven't gotten around to it yet and my current DIY pool allocation scheme seems ok.
This is 100% something that would happen to me, though, and I'll be extra careful if I do switch lol.
Glad you found that one wrong flag instead of deciding to rewrite some large unrelated portion of your code.
@Rekongstor 3 years ago
The first thing I was thinking of is not a pixel but a vertex shader. Then after looking at PCI occupancy it was definitely obvious that some buffers are stored on a CPU. I was developing a Dx12 renderer and it was like the first thing to do is to use an upload (staging) buffer.
Although I didn't use GPU trace till now. Now I know, thank you.
@taw3e8 3 years ago ⁺⁵
Wasn't the GPU memory almost empty then? No one noticed?
It's pretty scary that such a small difference can cause such a big change... i've heard recently that someone got an order of magnitude performance increase by adding some noop instructions to his functions so they would have better layout in cache (hot code in 1 cacheline) xD programming is brutal sometimes
@SianaGearz 3 years ago ⁺³
Well it's almost empty one way or another. Sponza geometry is like what, 8MB worth of buffers? Textures and render targets were in VRAM anyway, they're a good chunk bigger all together.
@marsovac 11 months ago ⁺¹
Vulkan is like C. It gives you a lot of ammunition to shoot yourself in the foot. But you can also shoot things that OpenGL can't :D
@rayansattarkhan6807 3 years ago
How do you search for files in Visual Studio? What's that window that popped up when you were searching for the performance macro at 1:51?
@christianvogelgesang5925 3 years ago
First step in optimizing should always be to open the profiler. The time is almost never lost where you are expecting it.
@BossBeneBaby 3 years ago
Well i knew that uploading shaders to gpu storage is important but not that it has such an impact. Good to know and great to see how you tackle bugs like this.
@ckjdinnj 2 years ago
I just spent 15 mins (2x speed) for cherno to change a flag
…
It was still nice to learn about Nsight though. Thanks for the video
@martonkos5775 3 years ago
These thumbnails are getting wilder as Cherno gets deeper into Vulkan development.
@Gamerexde 3 years ago
Ok, now Vulkan looks scarier than before...
@foomoo1088 2 years ago
That is possible in OpenGL with similar settings, and there’s tons of details to these rendering pipelines that all need super careful attention! It’s not exactly always an error, because sometimes you do want to set it up that way (e.g. water simulation or cloth simulation that is updating the vertices on the CPU). With these low level APIs you have to tell exactly what you want it to do
@toffeethedev 3 years ago ⁺⁶
Insane story ahah, just a testament to how useful these debugging and profiling tools are. Hope this is the kick for us to stop debugging by using printfs
@rukna3775 3 years ago
haha 😎👍
@mr.mirror1213 3 years ago ⁺¹
Fuck that's me
@prateekkarn9277 3 years ago ⁺¹
I wish school/college would teach you to use debuggers effectively. Instead we get tested on IDEs with no debuggers and the only help you get is just printfs
@mastershooter64 2 years ago
@@prateekkarn9277 sometimes even stupid things like writing code on paper in an exam
@Morimea 1 year ago
great video
welcome to Vulkan memory flags xD
complexity explosion xDD
then there goes depth-buffer flags... changing single flag you can have 100x faster performance to 100x slower
@Ikogn 2 years ago
24:43 it is actually possible to allocate an OpenGL buffer in CPU memory with glBufferStorage and GL_CLIENT_STORAGE_BIT
@ValentinTaranenko 3 years ago ⁺²
Great video. Vulkan is pretty interesting for me. And I think most programming issues are tiny stupid bugs, even outside of game engine dev.
@kubic-c3186 3 years ago
Four questions: Do you have any good tips for learning Vulkan? What exactly is a renderer, and what would a renderer interface look like? What defines a Vulkan "Context" in your renderer, or in other words: what does Vulkan "Context" mean?
@muhammad.m.siddiqui.mp4 3 years ago ⁺³
I love how dramatic these thumbnails have gotten.
@mchughm16a4 3 years ago ⁺¹
What kind of keyboard do you use? I like the sound of it
@silverqx 2 years ago
New movie, Vulkan detective story 😎, but the truth is that we've all been there. I experienced something similar with RegEx: after I dropped RegEx from the most critical parts I gained 50% performance in a few hours. This inspired me and next week I'm going to do some perf tuning. 🤓🚀
@MrCarburettor 3 years ago
Validation layers have performance suggestions, they should have raised a flag for this one since it is one of the most common mistakes.
I totally understand your frustration!
If you ask me, Nsight also should have told you to look at the resources uploaded to vmem and that your app was limited by it.
Great share! Thanks!
@undefBehav 3 years ago ⁺¹
Vulkan is clearly toying with you at this point. My condolences to you, good sir.
@boku00 3 years ago
Video is amazing, and it was amazingly helpful.
@UkkosTukki 3 years ago ⁺¹
Really interesting stuff, thanks!
@pepperkake5052 9 months ago
A bit late to the party, but I got hung up in the "#type" preprocessor directive in the shader, which doesn't appear in any documentation. Is this just an addition of your own, to let the engine split one file into respective shader types before compiling them?
@user-lz2oh9zz4y 3 years ago ⁺²
thanks for the large font
@jeffg4686 2 years ago
Anyone happen to know if the "vulkan memory allocator" can be used to "protect" graphics memory? I remember watching a WebGPU video where the presenter mentioned that VertexBuffers might not make it in the spec because can't control the security around it (prevent from accessing beyond bounds). But was thinking lower level APIs (or more likely the web assembly host) could be used to protect GPU memory from being used maliciously. Perhaps the web assembly runtimes/hosts need a memory manager that works like the VMA to be able to control access to memory.
@jakub7321 3 years ago ⁺⁵
What GUI library do you use for Hazel?
@ricardoalcantara5846 3 years ago
Dear imgui
@jamesmnguyen 3 years ago
dear imgui
@mihajlosreckovic8404 3 years ago
ImGui i think
@abdelhaksaouli8802 2 years ago
is lighting an entity and do you have a rendering component for each model ?
@DennisMartenssonOfficial 3 years ago
I've had multiple of those small things in my Vulkan renderer, fortunately never this one, I followed a tutorial while implementing mine that used staging buffers. (: Anyways - What differences did you notice when switching to the Vulkan Allocator? Just lifting the number of allocations limit? Or were there any performance differences? Implementing that allocator has been on my to do list for some time now. (:
@AntonHelm 3 years ago
I would be curious to see the traces after the fix ... but congrats on finding the issue
@elirannissani914 3 years ago ⁺¹
Cherno, You are amazing!
@mkvalor 3 years ago
No need to fear what other terrible things might be lurking in the more complex API -- your frame time and your growing skills for how to trace and debug have got you covered.
@carlosmarques535 3 years ago
Hello Cherno, great video.
At 12:25 you talk about how you can't use some Nsight features on non-RTX cards, besides the whole tool being chained to Nvidia GPUs. Wouldn't GPUOpen provide a good alternative in these cases, or at least a complement?
- Does GPUOpen replace Nsight? Does it do everything Nsight does?
- No idea, but it runs on everything, which is better than nothing.
I recommend checking out the video on the AMD YouTube channel titled: 'AMD RDNA™ 2 - Radeon™ GPU Profiler 1.10'. It is an overview of what the GPUOpen Tools can do.
@Energy0124HK 2 years ago
Lesson learned, having a friend who works in EA is very helpful XDD
@aodfr 3 years ago
Yeah, reading your buffers from sys memory would give you nasty performance issues. I optimized a little pong clone simply by adding in staging buffers to move the data from host memory to device memory. It's nice to watch a game going from 10 fps to over 200 fps by simply moving to local device memory.
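The host-memory versus device-memory choice that several comments describe comes down to which memory type a buffer is allocated from. Below is a rough, self-contained sketch of the usual selection loop. It does not include the Vulkan headers: the three constants are stand-ins that mirror the numeric values of Vulkan's `VkMemoryPropertyFlagBits`, and `findMemoryType` is a hypothetical helper, not the code from the video.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Stand-ins mirroring the numeric values of Vulkan's VkMemoryPropertyFlagBits.
constexpr std::uint32_t kDeviceLocal  = 0x1;  // VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT
constexpr std::uint32_t kHostVisible  = 0x2;  // VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT
constexpr std::uint32_t kHostCoherent = 0x4;  // VK_MEMORY_PROPERTY_HOST_COHERENT_BIT

// Returns the index of the first memory type that is allowed by `typeBits`
// (one bit per type, as in VkMemoryRequirements::memoryTypeBits) and that has
// every flag in `required`. Returns -1 when no type matches.
int findMemoryType(const std::vector<std::uint32_t>& typeFlags,
                   std::uint32_t typeBits,
                   std::uint32_t required) {
    assert(typeFlags.size() <= 32);
    for (std::size_t i = 0; i < typeFlags.size(); ++i) {
        const bool allowed = (typeBits & (1u << i)) != 0;
        const bool hasAllFlags = (typeFlags[i] & required) == required;
        if (allowed && hasAllFlags) {
            return static_cast<int>(i);
        }
    }
    return -1;
}
```

In a staging setup, the upload buffer would ask for `kHostVisible | kHostCoherent` and the vertex buffer that the GPU reads every frame would ask for `kDeviceLocal`. Requesting host-visible memory for the latter is the kind of slip that sends every vertex fetch over PCIe.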
