Good, I will now try to run the 405B model on my $50,000 PC.
Is even a 50k PC fast enough? ;-)
You can run the 405B model on a server with at least 256 GB of RAM and a Vega 56 or 64 graphics card.
That specific card can access system RAM as if it were VRAM and, on some platforms, bypass the CPU.
You can also buy Optane PMem for cheap. Either add 4x4 Optane modules via PCIe, RAID 0 them, and add the array to swap (sketched below), or install them as DDR4 DIMMs to add a massive RAM pool.
I'll be experimenting with Vega 20 next week.
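A minimal sketch of that RAID 0 + swap setup, assuming four Optane NVMe drives appear as /dev/nvme0n1 through /dev/nvme3n1 (the device names, and wrapping the commands in Python rather than a plain shell script, are illustrative assumptions; it must run as root and will wipe those drives):

```python
# Illustrative sketch: stripe four Optane NVMe drives into RAID 0 and use the array as swap.
# Device names are assumptions; run as root; this DESTROYS any data on those drives.
import subprocess

devices = ["/dev/nvme0n1", "/dev/nvme1n1", "/dev/nvme2n1", "/dev/nvme3n1"]

def run(cmd):
    print("+", " ".join(cmd))
    subprocess.run(cmd, check=True)

# Create the RAID 0 array from the four drives.
run(["mdadm", "--create", "/dev/md0", "--level=0",
     f"--raid-devices={len(devices)}", *devices])

# Format the array as swap and enable it with high priority, so the kernel
# prefers it over any slower swap that is already configured.
run(["mkswap", "/dev/md0"])
run(["swapon", "--priority", "100", "/dev/md0"])
```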
I've not tried that, but I feel it will be too slow or not usable. Let me know how it goes for you.
@SchoolofMachineLearning I have confirmed (a) it is slow, but (b) it works.
Still waiting for my MI60 to come in, and I decided to get a Cascade Lake Xeon Gold system that supports Optane PMem. Gonna post progress on the Level1 forums. Right now memory access time is the major issue. If you want accuracy and don't care about tokens per second, or you just want to tinker, a single Vega 56 is not as bad as you might think: about 10 seconds to first word, which is commendable for such a large model.
Don't do any jailbreak prompts or it straight locks up. No idea why.
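For reference, a minimal sketch of how you could measure that time-to-first-word yourself through Ollama's streaming /api/generate endpoint; the model tag llama3.1:405b and the default localhost port are assumptions, so substitute whatever you actually pulled:

```python
# Illustrative sketch: time-to-first-token against a local Ollama server.
# Assumes Ollama is listening on its default port and the model tag exists locally.
import json
import time
import requests

MODEL = "llama3.1:405b"  # assumed tag; replace with the model you pulled

payload = {"model": MODEL, "prompt": "Explain RAID 0 in one sentence.", "stream": True}
start = time.monotonic()

with requests.post("http://localhost:11434/api/generate", json=payload, stream=True) as resp:
    resp.raise_for_status()
    first_token_at = None
    chunks = 0  # each streamed chunk is roughly one token
    for line in resp.iter_lines():
        if not line:
            continue
        chunk = json.loads(line)
        if chunk.get("response"):
            chunks += 1
            if first_token_at is None:
                first_token_at = time.monotonic()
        if chunk.get("done"):
            break

elapsed = time.monotonic() - start
if first_token_at is not None:
    print(f"time to first token: {first_token_at - start:.1f}s")
print(f"{chunks} chunks (roughly tokens) in {elapsed:.1f}s total")
```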
Is there a single source of information that gives detailed hardware requirements for each of the Llama 3.5 models (i.e. GPU, RAM, memory, cache, etc.)?
I've posted extensive details. The link is in the description box. Meta doesn't officially release any hardware requirements afaik.
Thanks a lot for the detailed explanation in the video! I have a question regarding Ollama. Is it possible to use Ollama and the models available on it in a production environment? I would love to hear your thoughts or any experiences you might have with it. Thank you!
It's not recommended for a production environment; it's aimed more at consumer hardware than production hardware.
um, 405B is like 300 gigs, good luck with THAT lol
definitely not for the average user :P
Have u guys tried the 405B version locally? What are ur PC specs?
Most people prefer to run it in the cloud, as the requirements for a local PC would be incredibly high. I'd also recommend running a quantised version.
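To put rough numbers on the "~300 gigs" and the quantisation advice above, here is a back-of-the-envelope sketch; the ~10% overhead factor for runtime buffers is an assumption, and real quantised downloads will differ somewhat by format:

```python
# Rough weight-size estimate: parameters x bits-per-weight, plus an assumed
# ~10% overhead for runtime buffers. Real quantised files vary by format.
def approx_size_gb(params_billion: float, bits_per_weight: float, overhead: float = 1.10) -> float:
    bytes_total = params_billion * 1e9 * bits_per_weight / 8
    return bytes_total * overhead / 1e9

for label, bits in [("FP16", 16), ("Q8", 8), ("Q4", 4)]:
    print(f"Llama 405B @ {label}: ~{approx_size_gb(405, bits):.0f} GB")

# Prints roughly: 891 GB (FP16), 446 GB (Q8), 223 GB (Q4).
```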