Thanks for letting us know about this new release. Just tried it on my 6800xt, and it works. FYI, I think the supported list is all Navi 21 cards and all RDNA 3. That's the same list as the HIP SDK supported cards on the AMD ROCm Windows System Requirements page.
How many tokens/s??? Using a 7B model??
And the 7600 XT is not part of the officially supported list.
@@JoseRoberto-wr1bv On the Q8_0 version of Llama 3 I was getting 80 t/s, but for a couple of reasons the quality wasn't so good. I'm using Mixtral Instruct as my daily driver, and getting 14-18 depending on how I balance offload vs context size.
@@chaz-e that and the 7600 are both gfx1102.
TH-cam algorithm is crazy. I tried to do this with my 6800xt when I first got it. After all my research, I found nothing. And five months later, here we are. I can't wait to try this out. Thank you.
Thank you for the video. I can now use 8B LLM models with my AMD RX 7600 (8GB) and it is really fast. I use Arch Linux and it runs without any problems 👍
How did you get it to work on Linux? I've been having issues (and Ollama seems to recommend the proprietary AMD drivers....)
@@puffin11 Don't install the AMD Pro (proprietary) drivers. The open-source amdgpu driver is completely sufficient with ROCm.
I was torn between buying an RTX 3060 and an RX 7600; I thought ROCm was not supported on this card. How are image generation and model training?
@@whale2186 If you work a lot with AI models, projects, an Nvidia RTX graphics card is the best choice. AMD ROCm support is okay but unfortunately not nearly as good as the support from Nvidia CUDA and cuDNN.
@@sebidev Thank you. I think I should go with a 3060 or 4060 with GPU passthrough.
Brilliant! Thanks for letting us know, I am excited to try this.
Will be trying this out later on, thank you my man.
It works great on the 6800xt. Thank you for the guide.
Is it as fast as in the video?
@@agx4035 The video accurately shows expected performance, yes.
Just picked up a 16gb 6800, can't wait to get it installed and see what this baby can do! ;D
@@CapaUno1322 Update?
Do you think it will work well on my standard RX 6800?
I've successfully run 70B models with 4-bit quantization on my 4070 Ti Super. I offload 27 of the 80 layers to the GPU, while the remainder sits in system RAM. It works quite well: not exceedingly fast, but fast enough for comfortable use. A minimum of 64GB of RAM is required. While VRAM matters, in reality you can run 70B networks with even 10GB of VRAM or less. It ultimately comes down to how long you are willing to wait for the model's responses.
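If you want to reproduce this kind of partial offload outside of a GUI, here is a minimal sketch using llama-cpp-python (my assumption, not necessarily what the commenter used); the GGUF filename and layer count are placeholders you would tune to your own VRAM:

```python
# Minimal sketch of partial GPU offload with llama-cpp-python
# (assumed install: pip install llama-cpp-python).
# The model path and layer count are placeholders; lower n_gpu_layers
# until the model fits in your VRAM, the rest stays in system RAM.
from llama_cpp import Llama

llm = Llama(
    model_path="llama-3-70b-instruct.Q4_K_M.gguf",  # hypothetical local GGUF file
    n_gpu_layers=27,   # layers kept in VRAM; the remainder runs from RAM
    n_ctx=4096,        # the context window also consumes VRAM, so keep it modest
)

out = llm("Explain partial GPU offloading in one sentence.", max_tokens=128)
print(out["choices"][0]["text"])
```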
It would be nice to try this on an AMD equivalent, maybe a 7800 XT or 7900 XT.
Maybe he can share his tok/s. Comparing my 2080 with the video (RX 7600), I get these results: I just tried this on my 2080 (non-Super) and get 62.40 tok/s, which is around 40% faster for a card with around the same gaming performance. The VRAM usage seems a bit lower though (base was at 1.8GB and with the same model loaded it was 7.2GB), so around 5.4GB of VRAM for the model. Hopefully AMD can catch up in the future :(
Nice to know. I thought it only used VRAM or RAM, not both. Good to know it adds up all the available memory.
@@gomesbruno201 7900 XTX is around the same performance as the 4070 Ti Super.
Amazing video, I learnt a lot! I love these videos about commercial GPUs running AI/ML workloads, as I'm into developing AI/ML models.
It's not working for me. I have a 7900 XT installed and attempted the same as you, but I just get an error message with no apparent reason. Drivers are up to date and everything is in order, but nothing.
Thanks, this is the only good video I could find on YT that explained everything easily. Your accent helped me focus. Very useful stuff.
Thank you! Glad to be helpful :D
Thanks, worked for me very well on my 6800xt! The answers are as quick as in the video. But I guess I need to learn how and what to ask, because the answers were always very confident and always completely wrong and made-up. I asked the chat to make a list of French kings who were married off before they were 18 yo, and it invented a bunch of Kings that never lived, and said that Emperor Napoleon Bonaparte and President Macron were both married off at 16, but they were not kings technically, and they were certainly not married at 16, lol.
If you already have this GPU, go ahead and play with LLMs. It's a good place to get started. I started playing with a Vega 56 GPU, which is rock bottom of what ROCm supports for LLMs if I understand things correctly. If LLMs are the focus and you are buying new, Nvidia is still the better option. An RTX 3060 with 12GB of VRAM gives you 20% more tokens/s at 20% less price. I sometimes see used RTX 3080s at the same price point as the RX 7600 XT. You don't need all that VRAM if you don't have the compute power to back it up.
You are aware that you are running a q4 version of the model?
Which explains the low VRAM usage.
Works just fine with an RX 5700 XT; it responds decently fast.
What a fabulous collection of undefined acronyms.
Well, it is not like GPGPU arrived only with LLMs. OpenCL on AMD GPUs in 2013 and before was the most viable option for crypto mining, while Nvidia was too slow at that time due to small cache sizes and poor efficiency. That all changed with the 750 Ti and GTX 9xx generation of cards. The history of GPU programming is even longer than that, as people were trying to bend even fixed-pipeline GPUs into calculating things unrelated to graphics. The GeForce 8 with its early, limited CUDA was of course a game changer, and I have been a big fan of CUDA and OpenCL ever since. Thanks for a great video on the 7600 XT! ❤
Can you add multiple AMD GPUs together to increase the power?
As a total dummy on all things LLM, your video was the catalyst I needed to entertain the idea of learning about all this AI stuff. I'm wondering (and this would be a greatly appreciated video if you make it): is it possible to put this GPU in my streaming PC so it encodes and uploads the stream while at the same time running a local LLM that interacts with the chat on Twitch? How can I integrate these models with my Twitch streams?
Can you talk more about the difference in VRAM usage efficiency between AMD and Nvidia? I would like to learn more about this.
Good vid; however, the AMD ROCm versions of the relevant files are no longer available (the link in the description leads to the generic LM Studio versions)? The later versions don't appear to specifically recognize AMD GPUs?
GPU not detected on RX 6800, Windows 10. Edit: never mind, you must load a model first from the top center.
Good news! ;D
What do you mean by "load a model first from the top center"? I couldn't get ROCm to recognize my GPU either, but that was through WSL 2, not this app.
Seeing as how I spent last night trying to install ROCm without any luck, nor could I find any good tutorials or a single success story, I'll be curious to see how insanely easy this is. Wait, I don't need to install and run ROCm in WSL?
Hey, I've had success with ROCm on 5.7/6.0/6.1.1 on Ubuntu and 5.7 on Windows so let me know if you're still having an issue and I can probably point you in the right direction
Can you do an update when ROCm 6.1 is integrated to LM Studio?
6.1 is not likely to ever be available on Windows. Need to wait for 6.2 at least.
@@bankmanager Ok, thanks for the reply.
07:34 Not sure if this will fix it, but try unchecking the "GPU offload" box before loading the model. Do tell us if it works!
I thought it only ran on Linux. Do you use WSL/Docker?
Hi, does it work on the RX 5500 series?
Alas... since this uses ROCm, and AMD does not list *any* RDNA1 cards, the answer is almost certainly... no. You really wouldn't even want to try it, though, since the RX 5500 XT is a severely gimped card (not to mention the horror of the non-XT OEM variant) - it has only 1408 shader cores, compared to the next step up, the RX 5600 XT's 2304 cores - nearly a 40% cut in compute! And it has a measly 4 GB of VRAM... that's complete murder for LLM usage - everything will be slow as molasses. You'll lose more time and money trying to run the model (even if it were supported) than if you just got an RX 6600 - that card is *still* the best value on this market, so if you want a cheap entry-level card to try this out, I would recommend that.
When you have an LLM on your machine, can it still access the internet for information? Just thinking aloud. Thanks, subbed! ;D
Turn off the internet and see what happens :)
Yes, you can toggle the web search feature for the latest data.
This is cool, but I have to say that I'm running Ollama with OpenWebUi and a 1080Ti and I get similarly quick responses. I would assume a newer card would perform much better, so I'm curious where the performance of the new cards really matters for just chatting, if at all.
If you add voice generation, then it matters a lot. With no voice, anything over 10 tokens/sec is pretty usable.
Hope this works with my RX 6800.
How is it doing with image generation?
Am I required to install the AMD HIP SDK for Windows before I can use LM Studio?
Yes.
Can the RX 570 8GB variant support ROCm?
Are any of these models that we can run locally uncensored/unrestricted?
Officially this is for the 6000 and 7000 series only at the moment on Windows.
Incredible video!
Great video. Worked for me on the first try. Is there a guide somewhere on how to limit/configure a model?
If there's a Windows driver for ROCm, how come PyTorch still only shows ROCm as available for Linux?
Anyway, good to know it works. I'd like to buy a new system dedicated to LLM/diffusion tasks, and yours is the first confirmation that it actually works as intended 😅
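For reference, a minimal sketch of what the Linux-only PyTorch ROCm wheels look like in practice; the rocm6.0 index URL is just one of the versioned indexes PyTorch publishes, so match it to whatever ROCm version you actually have installed:

```python
# Assumed install on Linux (pick the index matching your ROCm version):
#   pip install torch --index-url https://download.pytorch.org/whl/rocm6.0
import torch

# ROCm builds reuse the CUDA API surface, so torch.cuda reports HIP devices.
print("GPU available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("Device:", torch.cuda.get_device_name(0))   # e.g. an RDNA2/RDNA3 card
    print("HIP/ROCm version:", torch.version.hip)     # None on CUDA-only builds
```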
Can you do a comparison vs CUDA?
Can we use it to generate images as well (like Midjourney or DALL-E), or does it work only for text?
Yeah, on Linux with Stable Diffusion.
I've been looking to make a dedicated AI machine with an LLM. I have a shelf-bound 6800xt that has heat issues sustaining gaming (I have repasted it; I think it's partially defective). I didn't want to throw it away, and now I know I can repurpose it.
Does anyone know of a way to make an RX 580 run with ROCm on Windows? Yes, it's old, but it would be better than using the processor to play with A.I. and there are plenty of RX580s out there.
Can you teach us how to run LIMs, large image models?
"If you've used an AMD GPU for compute work, you'll know that's not great"
Bruh that Pugetbench score shows the RX 7900 XTX getting 92.6% of the RTX 4090's performance and it has the same amount of VRAM for at least £700 less. 💀💀
I would like to see how it performs with the standard (non-XT) RX 7600.
Today I finally jumped off the AMD struggle bus and installed an NVIDIA GPU that runs AI like a boss. Instead of waiting SECONDS for two AMD GPUs to SHARE 8GB of memory via torch and pyenv and BIFURCATION software…
My RTX 4070 Super just does the damn calculations right THE FIRST TIME!
What about multiple 7600 cards?
I'll try it in a few hours with the 780M iGPU and let you know
Not working!
You couldn't load the 30B-parameter one because in your settings you're trying to offload all layers to your GPU. Play with the settings and try reducing the GPU offload to find your sweet spot.
ZLUDA is available again btw
Wait, 30-billion-parameter models are fine with GGUF and 16GB, even with 12? Is there something I'm missing?
Quantization... which decreases the quality of the responses. Not really worth it, in my opinion.
@sean58271 I don't even know why I said that; a 30B model can't really fit in 16GB of VRAM, and surely not in 12. Also, quantization is fine: going from an 8B model at FP16 down to 8-bit, basically nothing changes; from 4-bit there is a loss, but the bigger the model, the less noticeable it is.
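A rough back-of-the-envelope way to sanity-check those VRAM claims; the bits-per-weight figures below are approximations of common GGUF quant levels, and runtime overhead (KV cache, buffers) is ignored:

```python
# Rough rule of thumb for GGUF weight size: parameters * bits-per-weight / 8.
# The bits-per-weight values are approximate (real GGUF quants mix block scales
# and per-tensor precisions), and KV cache / runtime overhead is not counted.
def approx_weight_gb(params_billion: float, bits_per_weight: float) -> float:
    return params_billion * 1e9 * bits_per_weight / 8 / 1e9

for params in (8, 30, 70):
    for label, bits in (("FP16", 16), ("Q8_0", 8.5), ("Q4_K_M", 4.8)):
        print(f"{params}B {label}: ~{approx_weight_gb(params, bits):.1f} GB")

# A 30B model at ~4.8 bits/weight is roughly 18 GB of weights alone,
# which is why it won't fit in 16 GB of VRAM without offloading layers to RAM.
```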
How does it work with laptops? We have two GPUs, a small one and a large one, and LM Studio uses the small GPU :(
Nice mini-LED!
Can I run KoboldAI on an RX 7800 XT? A 13B model with 4-bit quantization? Currently using a 12GB 3060 and it has been a great card overall. But Nvidia being the as*es they are, they won't increase the VRAM size even if they double the price of the same series of cards; if anything, they lower it. So I'm planning on switching sides.
Please, someone tell me how to make this 7600 XT work normally with Stable Diffusion.
I have a 6800XT, 6900XT and a 7900XT. I will attempt this on each.
Would this work with Ollama?
Yes
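Assuming Ollama is installed with ROCm support and a model has already been pulled (the model name below is just an example, e.g. via `ollama pull llama3`), a minimal sketch of querying its local API looks like this:

```python
# Minimal sketch of calling a local Ollama server (default port 11434).
import json
import urllib.request

payload = {
    "model": "llama3",                      # example model; use whatever you pulled
    "prompt": "Say hello in five words.",
    "stream": False,                        # one JSON object instead of a token stream
}
req = urllib.request.Request(
    "http://localhost:11434/api/generate",
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.loads(resp.read())["response"])
```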
I asked if it can generate a QR code for me and it failed.
It says the runtime isn't supported.
Can I do anything useful on the Phoenix NPU? Just bought a Phoenix laptop.
Can you try Ollama with this ROCm thing? I've been splitting my head trying to get it to work with a 6800xt.
Ollama doesn’t work with ROCm. It is for nvidia and Apple silicon only.
@@ManjaroBlack Talking out your hairy buttocks.
@@ManjaroBlack ROCm is an AMD project, like AMD CPUs and GPUs. Are you high?
Mine isn't using the GPU; it still uses the CPU. 6950 XT.
WHAT IF I'M ON WINDOWS?
It would be good if you also created a video for Open WebUI + AMD.
How tf do you download Llama? It's so weird.
So I asked the AI what it recommends if I want to upgrade my PC, and it recommended an RX 8000 XT 💀
Is there RX 580 support? Who knows for sure? (It's not on the ROCm list; that's why I'm asking.) Or at least, does it work with the RX6600M? Because I only see the RX6600XT in the compatibility list.
The RX6600M is the same chip as the RX6600 (Navi23), just with a different vBIOS - and since Navi23 XT (RX6600XT-6650XT) is simply the full die, without cutting, it should work on the RX6600M too - same chip, just a bit cut down.
(Not a bad bin, though - it's a good bin, with an even higher base clock than the desktop RX6600, but with shaders cut on purpose to improve efficiency. I.e., desktop RX6600s are failed bins of RX6600XTs which are then cut down to justify their existence, while laptop RX6600Ms are some of the best 6600XT dies, cut on purpose to save power.)
I'd like you to try out an 8700G with fast RAM to run LLMs. Also, please run Linux.
It would be interesting to see if the NPU in those CPUs could be usable.
I just tried this vs my 2080 (non-Super) and I get 62.40 tok/s, which is around 40% faster for a card with around the same gaming performance. The VRAM usage seems a bit lower though (base was at 1.8GB and with the same model loaded it was 7.2GB), so around 5.4GB of VRAM for the model. Hopefully AMD can catch up in the future :(
Amazing. The 7600 (XT) is not even officially supported in AMD's ROCm software.
Calling a 350-370€ graphics card "budget" is kinda weird, ngl.
Nvidia said so, pal.
Shouldn't you be at the Olympics? Maybe you are! 😅
How on earth can these cards be cheaper than NVIDIA's? I think I'll never buy NVIDIA again...
It's working on my 6700xt, thanks.
Hey man, did you find the download link for that AMD ROCm version? Because it's just giving me the normal one.
ChatGPT 3.5 has about 170B parameters, and I heard that ChatGPT 4 is an MoE with 8 × 120B parameters, so effectively 960B parameters that you would have to load into VRAM.
Let me know when AMD can run diffusion models quicker than CPUs 😢
You should mention that ROCm only supports... three... AMD GPUs.
More than 3
@@user-hq9fp8sm8f Source?
@@user-hq9fp8sm8f does it support RX 5600 XT?
@@arg0x- no
@dead_protagonist ... you should mention that you don't know what you are talking about and/or didn't read the supported/unsupported GPU compatibility list ...
or ... maybe you just can't count ¯\_(ツ)_/¯
Hintz Summit
Nice video. I just wish they didn't change the UI so much in the space of 8 months that your instructions and information are completely useless :( Literally nothing in their UI is the same now.
Buy an NVIDIA card and be happy.
I can't set my 7900 XTX to ROCm. The only option offered is Vulkan.
Clickbait title, I suppose? Just because you can run local LLMs doesn't mean the GPU plays in the same league as Nvidia's consumer GPUs (4090).
ROCm still very much sucks today.
ZLUDA :)
NEVER BUY AMD
How do you get the AMD out of your throat? Just wondering since I’ve never seen anyone gobble so hard…
Turn off your screen
Sad that there is only advertising here; an AMD GPU is bad. Where is the video about the problems of an AMD GPU?
I have had AMD GPUs for the past 14 years, never a problem. I'm on the 7900 XTX now and it works great for what I do.
AMD is improving its software at lightning speed. So what are you smoking? Why can't an AMD GPU do GPGPU with good software?
Not everyone can afford a 4090 GPU. AMD seems like a better value, at the cost of a little extra effort.
Got anything as good for image generation?