Jason, I absolutely love your content! Please continue with this straightforward and informative style of delivering knowledge. It’s so refreshing compared to the overly exaggerated or attention-grabbing approaches out there.
I used it - impressive results on my complex prompts!
Could you give an example of one of your prompts? One that you feel comfortable sharing, of course
@@cole71 Thank you so much for asking. I will create a video on DeepSeek and learning to prompt it so you can see my whole process.
@@cole71 It will be converting one of my ChatGPT prompts to be DeepSeek-compatible, so most likely creating an expert.
I am running the deepseek-r1:14b-qwen-distill-q8_0 variant locally (Ollama on Kubuntu 24.04) on my cheap ASRock DeskMeet X600 PC WITHOUT a dedicated GPU. My AMD Ryzen 5 8600G has only 16 TOPS and a 65-watt power limit. I have 64GB RAM which can be FULLY USED for inference. Inference is slow. Highly complex tasks (prompts) can run for up to 5 minutes, but even writing a well-structured prompt takes me more time than that. And the result saves me hours of work. The PC supports up to 128GB RAM, therefore running a 70B model should work perfectly when time is no issue. Due to the low power consumption there are no heat problems. So you trade speed against nearly unlimited model size - for me that is the perfect solution, especially considering that this is a
What are the benefits of running locally?
What is your use case for running slow but private AI models?
@@blackicet2107 You don't have to pay for an API and you aren't sharing your data with anyone.
How many tokens per second are you getting? I see it uses DDR5 RAM; do you think that makes a significant difference vs DDR4?
@@blackicet2107 You have AI even with no internet.
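For anyone curious how a setup like the one in this thread gets driven in practice, here is a minimal sketch that queries a local Ollama server and works out tokens per second from its reply, which speaks to the DDR4 vs DDR5 question above. It assumes Ollama's default port (11434), its /api/generate endpoint, and the deepseek-r1:14b-qwen-distill-q8_0 tag mentioned above; the prompt text is just a placeholder.

```python
# Minimal sketch: query a local Ollama server and estimate generation speed.
# Assumes Ollama is running on its default port with the model already pulled
# (e.g. via `ollama pull deepseek-r1:14b-qwen-distill-q8_0`).
import requests

resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "deepseek-r1:14b-qwen-distill-q8_0",
        "prompt": "Summarize the trade-offs of CPU-only LLM inference.",
        "stream": False,
    },
    timeout=600,  # CPU-only inference can take minutes, as noted above
)
data = resp.json()

# Ollama's response includes eval_count (generated tokens) and
# eval_duration (nanoseconds), so tokens/sec falls straight out.
tokens_per_sec = data["eval_count"] / (data["eval_duration"] / 1e9)
print(data["response"][:200])
print(f"~{tokens_per_sec:.1f} tokens/sec")
```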
Yes, it would be interesting if you dive deeper into reasoning models.
Great video, definitely interested in seeing you explore distillation for a custom use case
It's amazing, DeepSeek 😍💐
Prompt engineering was never going to last
It is still there, it just changed for a changed model.
Very good video, thank you Jason!
Thanks for sharing!
Imagine what will happen when we realize we need to slow thinking processes to human levels to reach human level thinking.
This reminds me of the book Thinking, Fast and Slow by psychologist Daniel Kahneman
but with faster computers, you can go fast
@@electric7309 Yes. But have you ever tried to speak to someone who talks much slower than you?
Not because of some sickness, just freaking slow. Now imagine trying to keep an AI from going crazy while waiting what feels like 1000 years for a plump human brain and fleshy tongue to form a coherent chain of information. We are freezing AI in place. When it finishes its response we literally freeze it in time. It no longer computes. From the AI's perspective we bombard it with non-stop information in chats, yet when connected to an android everything would be SOOO FREAKING SLOW.
So if we put a hardware limit on AI processing power, bringing it down to something like human speeds, then it could take its time in chats, but also when connected to an android.
Super calculation and super reaction speed would be used in other grades of robots, like the ones in fighter jets, or in space, or the ones calculating weather patterns.
But for human-to-AI relations we would need AI with slower speeds.
then it will get dumber.
I would be interested in a video on how to perform large-scale distillation for smaller domain-specific models, keep up the good work! I enjoy your videos, just double-check the thumbnail spelling lol
I have the 30b distilled version running locally. Crazy times!
What hardware are you using?
Following
@@haroldpierre1726 I am running the deepseek-r1:14b-qwen-distill-q8_0 variant locally (Ollama on Kubuntu 24.04) on my cheap ASRock DeskMeet X600 PC without a dedicated GPU. My AMD Ryzen 5 8600G has only 16 TOPS and a 65-watt power limit. I have 64GB RAM which can be FULLY USED for inference, for all LLMs up to 48GB in size. Inference is slow. Highly complex tasks (prompts) can run for up to 5 minutes, but even writing a well-structured prompt takes me more time than that. And the result saves me hours of work. The PC supports up to 128GB RAM, therefore running a 70B model should work perfectly when time is no issue. Due to the low power consumption there are no heat problems. So you trade speed against unlimited model size - for me that is the perfect solution, especially considering that this is a
I can run a 7B model on 16GB RAM and a CPU with a ~5000 PassMark score. The speed is about 3 tokens per second.
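To see why a 7B model fits comfortably in 16GB of RAM and why CPU-only speeds land in the single-digit tokens-per-second range, here is a rough back-of-the-envelope sketch; the bytes-per-parameter and memory-bandwidth figures are assumed round numbers, not measurements from anyone's machine in this thread.

```python
# Back-of-the-envelope estimate for CPU-only inference (all figures assumed).
params = 7e9  # 7B parameters

# Rough bytes per parameter after quantization.
for quant, bpp in {"q8_0": 1.0, "q4_0": 0.5}.items():
    print(f"{quant}: ~{params * bpp / 1e9:.1f} GB of weights")  # both fit in 16 GB

# Each generated token streams the weights through memory roughly once,
# so memory bandwidth sets an upper bound on tokens/sec for a dense model.
q4_size_gb = params * 0.5 / 1e9
for ram, bw_gbs in {"DDR4 (~40 GB/s)": 40, "DDR5 (~80 GB/s)": 80}.items():
    print(f"{ram}: ceiling ~{bw_gbs / q4_size_gb:.0f} tokens/sec")
```

Real-world numbers come in well under these ceilings once prompt processing and CPU compute limits kick in, which is consistent with the roughly 3 tokens/sec reported above.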
“Enginner”? -I barely know her.
Same here, seems like it's intentional btw
Thanks for the prompt tips. It seems that my extended system prompts are now useless.
thanks for the tips bro
When using graphics and figures extracted from external sources, can you please cite them for attribution and fact checking?
Is there a way to make my own distilled version if none of the available distilled models are helpful?
Is there a link to the notebook shown in the video?
The more it reasons, the more accurate the result. Just like my brain!!!
This is great. The only issue I see here is that the tokens used are gonna be super high. I'm not sure the ROI beats that of non-AI code for these tasks.
I'd be interested to see the comparison with today's models and costs.
Very interested in how to do that
If you remove context from your prompt, it will be hard to use it for software development tasks.
Can we get the notebook please?
What's more shocking is that DeepSeek is just a side project of those smart people in China who own lots of GPUs for crypto mining.
It’s backed by a hedge fund. Remember, if it’s free you’re the product!
@@JungianMonkey69 Even if it's not free, they are still collecting our data like every other big AI company. But DeepSeek also has an open-source model that can run on your device locally, even without internet, so you can't be the product because it's impossible for them to gather your data.
@@JungianMonkey69 interesting
Damn, cool
I'm following this guy because he looks Chinese
Prompt engineering lasted for about 3 weeks bro
heat seeka’
I tried DeepSeek R1 on the web, it is not even close to OpenAI, so why are the scores so high?
I have o1 pro and I pitted it against R1, and R1 scores higher based on my testing and experiments. If DeepSeek supports multimodal soon I will definitely unsubscribe from ChatGPT.
No idea man, I'm just here for the hype
What the heck is happening with your thumbnail? Is it intentional?
prompt engineer as a profession just began its path and it's already dying? XD
Was never a profession lol
Dying? Not even born yet smh
Wes Roth gimmick
No idea why everyone is having this awe moment with DeepSeek R1 and I am the one not having a good time with it. I always have to remind it about my requirements, to the point that I give up and use o1-mini. o1-mini solves the problem in one go. Weird.
because deepseek-r1 is free and open-source