Hey, I love the channel. I started making my own setup a few months ago and came across this while researching Docker. I'm new to programming environments and Ubuntu; I come from the industrial automation side, PLCs and networking. I am going to do a fresh install following your guide on a node I am setting up for practice.
Much appreciated! Hope you find it all a good starting place to get up and going quickly - getting started is always the hardest part.
@RoboTFAI exactly. I got my primary system set up after a month of configuring. I learned a lot about Linux in the process through troubleshooting, and I think I did 5 installs and builds from the ground up. This will be perfect for the secondary setup I want to connect to. I will get you a coffee for the time and energy, lol
Wow! I went through the entire series of three episodes you made. I am not even a coder, and I am very new to LLMs and AI, but I watched through all that you taught here, because I was thinking of buying AMD cards due to their usually larger VRAM and lower price. I have to say, your patience and style of guiding is very noob friendly. I really thank you for these videos.
If you get time, I have a few noob questions for you, if you can answer:
- I am still confused about how well the 7900 XTX will perform in terms of token speed on this Ubuntu server with ROCm, for 7B, 13B & 32B models at Q8 or FP16 respectively, for inference as well as for training. I couldn't find any specific numbers. Any guess?
- How many GPUs do you suggest for training 32B models?
- Which GPUs would you suggest going for, if not the 7900 XTX?
- The 'GPT-4' you were running during setup, was it a 13B model? I somehow missed that.
- Would you suggest using 13B or smaller models for coding and good problem-solving purposes for local office team use?
- What's your take on DeepSeek or Qwen for similar use?
Thanks a ton again. God bless.
Appreciated! Everyone has to start someplace, and it can be VERY complex if you don't have a background in running/building systems. We have to be able to learn from each other as a community. I don't know it all, but I'm happy to share what I do know and my own learning experiences with others.
I don't have a 7900 XTX in the lab, but we do have a 7600 XT - you can see how it handled itself in the Mistral Leaderboard series here: th-cam.com/video/uwHiOtvm9go/w-d-xo.html There are also several amazing viewers using the 7900 XTX who might be able to chime in with more info.
As many as possible - training is a way different beast than just running inference.
I mainly suggest Nvidia-based cards just because their software and integrations are well ahead of Intel/AMD - but that gap is closing, and you pay a premium with Nvidia. Check out the leaderboard on our site for more info on the cards we have tested: robotf.ai/Mistral_7B_Leaderboard and use some of that info to help with your decisions.
The LocalAI AIO images that come with those "GPT-4" etc. labeled models are actually models like Hermes Pro and the like - LocalAI lists the "mappings" here: localai.io/basics/container/#all-in-one-images just to get people started.
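If anyone wants to see what that looks like in practice, here's a minimal sketch of hitting a LocalAI AIO container through its OpenAI-compatible API - assuming it's running locally on the default port 8080, and keeping in mind the "gpt-4" name is just LocalAI's alias for a bundled open model, not OpenAI's actual GPT-4:

```python
# Minimal sketch: query a LocalAI AIO container via its OpenAI-compatible API.
# Assumes the container is running locally on the default port 8080; the
# "gpt-4" model name is just LocalAI's alias for a bundled open model
# (e.g. a Hermes variant), not OpenAI's GPT-4.
import json
import urllib.request

payload = {
    "model": "gpt-4",  # LocalAI maps this alias to a local model
    "messages": [{"role": "user", "content": "Say hello in one sentence."}],
}
req = urllib.request.Request(
    "http://localhost:8080/v1/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.load(resp)["choices"][0]["message"]["content"])
```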
I prefer to use 32B+ models (I run mostly 70B models for coding/etc., if not larger when possible) at high quants (Q6-Q8) and large context sizes (64-128k), which greatly affects your memory requirements.
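To give a rough sense of why quant level and context size matter so much, here's a back-of-envelope sketch - the layer/head numbers are illustrative assumptions for a 70B-class model, not exact specs for any particular one:

```python
# Back-of-envelope VRAM estimate: quantized weights + KV cache.
# The architecture numbers (80 layers, 8 KV heads, head dim 128) are
# illustrative assumptions for a 70B-class model, not exact specs.

def weights_gb(params_b: float, bits_per_weight: float) -> float:
    """Approximate weight memory in GB (params given in billions)."""
    return params_b * bits_per_weight / 8

def kv_cache_gb(context: int, layers: int = 80, kv_heads: int = 8,
                head_dim: int = 128, bytes_per_elem: int = 2) -> float:
    """Approximate KV cache in GB at fp16 (2 bytes), keys + values."""
    return 2 * layers * kv_heads * head_dim * context * bytes_per_elem / 1e9

for ctx in (8_192, 65_536, 131_072):
    total = weights_gb(70, 6.5) + kv_cache_gb(ctx)  # Q6_K is roughly 6.5 bpw
    print(f"70B @ ~Q6, {ctx:>7} ctx: ~{total:.0f} GB")
```

The weights alone for a 70B model at ~Q6 land near 57 GB, and pushing the context from 8k to 128k adds tens of GB more of KV cache - which is why the big-context, high-quant setups need so much memory.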
I have been a huge fan of DeepSeek and Qwen since well before R1, which is getting so much traction these days. Different models are better at different tasks, which is why we want as many open source models as possible! The DeepSeek coder models have been a staple in my lab for the last few years for sure.