A very beginner-friendly video. When I was starting out, I was tortured by various Python environments for a long time. I think the author's video can help more people get started with AI technology. Thank you for your contribution🥰
been running the llama2 70b since Nov 2023 when I bought the M3 Max 128GB RAM. It's been awesome, can't wait to download and try llama3
🙌 That's awesome! Can't wait to hear more about your Llama 3 experience!
How is it going? I also got a massive m3 and want to try it out
The performance of 70B is impressive! It is very usable and can run just on battery. The detailed responses and the commentator-like analysis are really something! It's amazing what we can achieve with local hardware these days.
A really awesome video, guys. Massive thanks.
Wow! You got it running - thanks for sharing your journey with this. Any chance of a dedicated video on how you quantized it to fit on the MBP?
Great suggestion! We're working on quantizing the instruct model, video coming soon. 😆
If you were able to run 405b on a MacBook, you should easily be able to do it on a 192GB Mac Studio M2 Ultra. I wish there was a packaged solution that you could just download and run. Nice work getting it going, and thank you for the video!
The M2 Ultra is indeed generally faster and supports larger models, but it's a desktop computer, so it's not as portable as your trusty MacBook! We're working on packaging it up to make it easier to install and use. Stay tuned for updates! And thanks for watching our YouTube video! 😆
@@PurrfectTechieTails Yes, the Studio isn't "as" portable as a laptop, but it's pretty powerful for its form factor. I'm not sure when we're going to get the 405b instruct version. And if you actually do produce an all-in-one Mac solution, I think many people would be very appreciative, myself included.
@@PurrfectTechieTails That would be awesome! Here's hoping it happens, and thanks for considering making this for the community!
What does a "2-bit" version mean? How exactly do the versions with more bits differ?
Not Apple model saying Naruto will save Sakura instead of Hinata 💀
😂
Would we be able to run 70b on a 48GB RAM M3 Max, or is 48GB not enough?
It takes around 40GB of memory to do it, so it will probably work, but you won't have much memory left to do anything else.
Is 128gb the absolute minimum it will run on? Or could 70b run with 32-64gb?
Nope, 128GB isn't the absolute minimum. It takes around 40GB of memory to run 70B.
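As a rough sanity check on these numbers, weight memory scales with parameter count times bits per weight. This is a minimal sketch; the ~10% overhead factor for the KV cache and runtime buffers is an assumption, not a measured value:

```python
def model_memory_gb(n_params: float, bits_per_weight: int, overhead: float = 1.1) -> float:
    """Estimate model memory in GB: params * bits / 8 bytes, plus a rough
    ~10% overhead assumption for KV cache and runtime buffers."""
    return n_params * bits_per_weight / 8 / 1e9 * overhead

# Ballpark figures for common Llama sizes at different quantization widths.
for name, n in [("8B", 8e9), ("70B", 70e9), ("405B", 405e9)]:
    for bits in (2, 4, 8):
        print(f"{name} @ {bits}-bit: ~{model_memory_gb(n, bits):.0f} GB")
```

This lines up with the thread: 70B at 4-bit lands near 40GB, and 405B only fits in 128GB at around 2 bits per weight.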
@@PurrfectTechieTails Thanks! I have a 32gb macbook now but may get an M4 or M5 later with 64gb then. Would a PC with 3090 gpu be faster, or similar?
@@neilquinn I have an M3 64GB Mac Studio. I think the 3090 GPU would for sure be better. But who knows about the M4 or M5~
Will I be able to run llama 8B on my MacBook air with M1 chip?
Depends on your MacBook's memory. I have an M1 Max with 32GB - llama 8b runs at lightning speed!
Thank you!
What is the specs of your MacBook?
M3 Max Chip with 128GB memory
Oh ok. That's good. No wonder it managed to run the 405B model. Nice video btw.@@PurrfectTechieTails
Can you tell how much RAM it used when it was running?@@PurrfectTechieTails
@@wisdomyaw03 Around 120 GB 🚀
@@PurrfectTechieTails Wow. Thanks.
you two are quite cute and your walkthrough is helpful and well explained. looking forward to more tales ;)
Thanks so much! 😊
Thanks for the video. Why would I use MLX over any other approach such as llama.cpp or ollama?
Thank you for bringing up this interesting topic! MLX is the first approach that comes to mind when considering running on Apple Silicon. However, we'd be delighted to explore other options as well. It would be fascinating to compare and contrast different methods to gain a more comprehensive understanding of the available approaches. We'll share our findings here once the results are in!
So cool! Thank you for sharing this!
Cool, comment for support!
Are you able to run LoRA to fine-tune LLAMA?
Yes that is possible. Thanks for the video idea! 💡
@@PurrfectTechieTails Are you able to fine-tune the LLAMA model on your 128GB M3 Max?
I'm a newbie to LLMs. What is the difference between 4-bit and 8-bit?
Hello! 4-bit saves more space and is faster but with more potential accuracy loss, while 8-bit provides a good balance for many applications.
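To make that tradeoff concrete, here is a hypothetical, simplified sketch of uniform quantization (real schemes like those in MLX or llama.cpp are more sophisticated, e.g. per-group scales): fewer bits means fewer representable levels, so the round-trip error grows as the bit width shrinks.

```python
import random

def fake_quantize(values, bits):
    """Round each value to the nearest of 2**bits evenly spaced levels
    spanning the value range, then map back to floats."""
    levels = 2 ** bits - 1
    lo, hi = min(values), max(values)
    scale = (hi - lo) / levels
    return [round((v - lo) / scale) * scale + lo for v in values]

random.seed(0)
w = [random.gauss(0, 1) for _ in range(10_000)]  # stand-in for model weights

for bits in (2, 4, 8):
    err = sum(abs(a - b) for a, b in zip(w, fake_quantize(w, bits))) / len(w)
    print(f"{bits}-bit mean abs error: {err:.4f}")
```

Running it shows the error dropping sharply from 2-bit to 8-bit, which is why 8-bit is the safer default and 2-bit is reserved for squeezing huge models into limited memory.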
I would just buy 2 3090s to run Meta-Llama-3.1-70B at more than 10 times the speed for less money.
Still pretty expensive, add the cost of the rest of the PC on top of that :D
@@flrn84791 1200 for the cards, 1300 for a 16 core, with 128gb ram
don't forget the cost of electricity
@@flrn84791 and the monitor, and the keyboard.
It's expensive.
10 times the speed? Are you sure? Apple's memory bandwidth is not far behind the 3090's, I believe
first time seeing a tech couple channel, instant follow
ikr lol
Will this run on the m2?
Yeah it should run well.
We only tested on M3 but we believe it should be able to run too!
Sub 31 - This will be a story that I will tell my grandchildren, that I was Subscriber #31 :D
Thank you for your support ❤️
cool
Singaporean in HK?
☺️