How to Turn Your AMD GPU into a Local LLM Beast: A Beginner's Guide with ROCm
- Published 26 Jun 2024
- RX 7600 XT on Amazon (affiliate): locally.link/kEJG
LM Studio: lmstudio.ai/rocm
Products provided by Gigabyte
Those of us with NVIDIA GPUs, particularly ones with enough VRAM, have been able to run large language models locally for quite a while. I did a guide last year showing you how to run Vicuna locally, but that really only worked on NVIDIA GPUs. Support has improved a little, including running it all on your CPU instead, but with the launch of AMD’s ROCm software, it’s now not only possible to run large language models locally, but insanely easy. Now if you are just here for the guide, skip here. If you want to know how this works so damn well, stick around!
BUY AN OSRTT (Open Source Response Time Tool): osrtt.com
Become a TH-cam Member and get sponsor free videos and access to our private Discord chat: th-cam.com/users/techteamgbjoin
Locally Links - Global Short Linking for Creators: locally.link/techteamgb
Use referral code "techteamgb20" when signing up!
Patreon: / techteamgb
Donations: streamlabs.com/techteamgb
OverclockersUK Affiliate link: techteamgb.co.uk/ocuk
Discord! / discord
As an Amazon Associate I earn from qualifying purchases, using the links below or other Amazon affiliate links here.
Want a cool T-Shirt or hoodie? teespring.com/en-GB/stores/te...
Private Internet Access (VPN): techteamgb.co.uk/PIA
NordVPN: nordvpn.org/techteamgb
HUMBLE BUNDLE: www.humblebundle.com/monthly?...
Zen Internet - My (UK) ISP: locally.link/b3DC
WEB HOSTING: techteamgb.co.uk/domaincom
Check out our awesome website! techteamgb.co.uk
Bitcoin donations: 1PqsJeJsDbNEECjCKbQ2DsQxJWYqmTvt4E
My Monitor - Philips EVNIA 8600 QD-OLED: locally.link/H33q
- My PC - AMAZON AFFILIATE LINKS -
5900X on Amazon: techteamgb.co.uk/5900x
32GB Corsair 3000MHz on Amazon: techteamgb.co.uk/cors32gbrgb
GBT X570 Master on Amazon: techteamgb.co.uk/gbtx570master
RTX 2080 on Amazon: techteamgb.co.uk/RTX2080
H100i Pro on Amazon: techteamgb.co.uk/h100ipro
WD 1TB NVME on Amazon: techteamgb.co.uk/wdbl1tb
Patriot VP4100 on Amazon: techteamgb.co.uk/vp4100
Sabrent Rocket Q 4TB on Amazon: techteamgb.co.uk/rocketq4
If you are interested in contacting us, then please email inbox@techteamgb.com and we will respond as soon as possible. - Science & Technology
brilliant! Thanks for letting us know, I am excited to try this
Will be trying this out later on, thank you my man.
Amazing video, I learnt a lot! I love these videos about commercial GPUs running AI/ML workloads, as I'm into developing AI/ML models.
It works awesome on the 6800 XT. Thank you for the guide.
Is it as fast as in the video?
@@agx4035 The video accurately shows expected performance, yes.
I've successfully run 70B models with 4-bit quantization on my 4070 Ti Super. I offload 27 out of 80 layers to the GPU, while the remainder uses system RAM. It works quite well: not exceedingly fast, but comfortable enough. A minimum of 64GB of RAM is required. While VRAM matters, in practice you can run 70B networks with even 10GB of VRAM or less. It ultimately comes down to how quickly you need the model to respond to your queries.
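The split described above can be sanity-checked with some back-of-envelope arithmetic. This is only a rough sketch: it assumes ~0.5 bytes per parameter for 4-bit quantization, an even distribution of weights across the 80 layers, and ignores the KV cache and runtime overhead, which add several more GB in practice.

```python
# Rough memory estimate for partially offloading a 4-bit 70B model.
# All figures are illustrative assumptions, not measured values.

PARAMS = 70e9          # total parameter count
BYTES_PER_PARAM = 0.5  # ~4-bit quantization
N_LAYERS = 80          # transformer layers in a typical 70B model
GPU_LAYERS = 27        # layers offloaded to VRAM, as in the comment

total_gb = PARAMS * BYTES_PER_PARAM / 1e9       # whole model on disk/in memory
vram_gb = total_gb * GPU_LAYERS / N_LAYERS      # share held by the GPU
ram_gb = total_gb - vram_gb                     # remainder in system RAM

print(f"model ~{total_gb:.0f} GB: ~{vram_gb:.0f} GB VRAM, ~{ram_gb:.0f} GB RAM")
```

The ~12 GB VRAM figure lines up with the comment's point: a 16 GB card can hold a third of a 4-bit 70B model, with the rest spilling into the 64 GB of system RAM.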
It would be nice to try this on an AMD equivalent, maybe a 7800 XT or 7900 XT.
Thanks for letting us know about this new release. Just tried it on my 6800xt, and it works. FYI, I think the supported list is all Navi 21 cards and all RDNA 3. That's the same list as the HIP SDK supported cards on the AMD ROCm Windows System Requirements page.
How many tokens/s using a 7B model?
And the 7600 XT is not part of the official supported list.
@@JoseRoberto-wr1bv On the Q8_0 version of Llama 3 I was getting 80 t/s, but for a couple of reasons the quality wasn't so good. I'm using Mixtral Instruct as my daily driver and getting 14-18 t/s depending on how I balance offload vs context size.
@@chaz-e That and the 7600 are both gfx1102.
Great video. Worked for me on the first try. Is there a guide somewhere on how to limit/configure a model?
Nice MiniLED!
Is there RX 580 support, does anyone know for sure? (It's not on the ROCm list, which is why I'm asking.) Or at least, does it work with the RX 6600M? I only see the RX 6600 XT in the compatible list.
Wait, 30-billion-parameter models are fine with GGUF and 16GB, or even 12? Is there something I'm missing?
Hi, does it work on the RX 5500 series?
Can I do anything useful on the phoenix NPU? Just bought a Phoenix laptop.
It's not working for me. I have a 7900 XT installed and attempted the same as you, but it just gives an error message with no apparent reason. Drivers are up to date and everything is in order, but nothing.
I would like to see how it performs with the standard RX 7600.
How is it doing with image generation?
Can we use it to generate images as well (like mid journey or dall-e) or does it work only for text?
yeah, on linux with SD
As a total dummy in all things LLM, your video was the catalyst I needed to start learning about all this AI stuff. I'm wondering (and this would be a greatly appreciated video if you make it): is it possible to put this GPU in my streaming PC so it encodes and uploads the stream while also running a local LLM that interacts with the chat on Twitch? How can I integrate these models with my Twitch streams?
Can you add multiple AMD GPUs together to increase the power?
I see you downloading the Windows exe for ROCm LM Studio. How in the hell are you running that exe? I don't see you using a Wine prompt.
I've been looking to make a dedicated AI machine with an LLM. I have a shelf-bound 6800 XT that has heat issues sustaining gaming (I've repasted it; I think it's partially defective). I didn't want to throw it away; now I know I can repurpose it.
07:34 Not sure if this will fix it but try unchecking the "GPU offload" box before loading the model, do tell us if it works!
Would this work with Ollama?
GPU not detected on RX 6800, Windows 10. Edit: never mind, you must load the model first from the top center.
Am I required to install the AMD HIP SDK for Windows before I can use LM Studio?
Yes.
This is cool, but I have to say that I'm running Ollama with OpenWebUI and a 1080 Ti and I get similarly quick responses. I would assume a newer card would perform much better, so I'm curious where the performance of the new cards really matters for just chatting, if at all.
If you add voice generation then it matters a lot. With no voice, anything over 10 tokens/sec is pretty usable.
RX 7600 XT or RX 6750 XT for LLM ? On Windows.
Can you do an update when ROCm 6.1 is integrated to LM Studio?
6.1 is not likely to ever be available on Windows. Need to wait for 6.2 at least.
@@bankmanager Ok, thanks for the reply.
How do I install the ROCm software? I'm at the website, but when I download it, all it does is delete my Adrenalin drivers… Do I need the Pro software to run ROCm? I still want to game on my PC too.
No, you don't need the pro drivers.
@@bankmanager How can I install ROCm?
Seeing as I spent last night trying to install ROCm without any luck, and couldn't find any good tutorials or a single success story, I'll be curious to see how insanely easy this is. Wait, I don't need to install and run ROCm in WSL?
Hey, I've had success with ROCm on 5.7/6.0/6.1.1 on Ubuntu and 5.7 on Windows so let me know if you're still having an issue and I can probably point you in the right direction
I asked if it can generate a QR code for me and it failed.
Mine isn't using the GPU; it still uses the CPU. 6950 XT.
ChatGPT 3.5 has about 170B parameters, and I heard that GPT-4 is an MoE with 8 times 120B parameters, so effectively 960B parameters that you would have to load into VRAM.
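The scale gap the comment describes is easy to quantify. A quick sketch, taking the rumoured and unverified 8 × 120B figure at face value and assuming 2 bytes per parameter for fp16 and ~0.5 for 4-bit quantization:

```python
# Back-of-envelope memory footprint for a rumoured GPT-4-scale MoE.
# The 8 x 120B expert figures are hearsay from the comment, not confirmed.

experts = 8
params_per_expert = 120e9

total_params = experts * params_per_expert      # effective 960B parameters
fp16_gb = total_params * 2 / 1e9                # 2 bytes/param at fp16
q4_gb = total_params * 0.5 / 1e9                # ~0.5 bytes/param at 4-bit

print(f"fp16: ~{fp16_gb:.0f} GB, 4-bit: ~{q4_gb:.0f} GB")
```

Even aggressively quantized, a model of that rumoured size would need hundreds of GB, which is why nothing GPT-4-class fits on a single consumer GPU.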
Amazing. The 7600 (XT) is not even officially supported in AMD's ROCm software.
So I asked the ai what it recommends if I want to upgrade my pc and it recommended RX 8000 XT💀
Zluda :)
You should mention that ROCm only supports... three... AMD GPUs.
More than 3
@@user-hq9fp8sm8f source
@@user-hq9fp8sm8f does it support RX 5600 XT?
@@arg0x- no
@dead_protagonist You should mention you don't know what you're talking about and/or didn't read the supported/unsupported GPU compatibility list...
or... maybe you just can't count ¯\_(ツ)_/¯
How do you get the AMD out of your throat? Just wondering since I’ve never seen anyone gobble so hard…
Sad that there is only advertising here. An AMD GPU is bad; where is the video about the problems of AMD GPUs?
I have had AMD GPUs for the past 14 years, never a problem. I'm on the 7900 XTX now and it works great for what I do.
AMD is improving its software at lightning speed. So what are you smoking? Why can't an AMD GPU do GPGPU with good software?
Not everyone can afford a 4090 GPU. AMD seems like a better value, at the cost of a little extra effort.