DeepSeek R1 - The Era of Reasoning Models

  • Published 31 Jan 2025

Comments • 64

  • @FunLau-u9e
    @FunLau-u9e 4 days ago +1

    Jason, I absolutely love your content! Please continue with this straightforward and informative style of delivering knowledge. It’s so refreshing compared to the overly exaggerated or attention-grabbing approaches out there.

  • @DrPrompt
    @DrPrompt 11 days ago +24

    I used it - impressive results on my complex prompts!

    • @cole71
      @cole71 1 day ago +3

      Could you give an example of one of your prompts? One you feel comfortable sharing, of course

    • @DrPrompt
      @DrPrompt 1 day ago

      @cole71 Thank you so much for asking. I will create a video on DeepSeek and learning to prompt it so you can see my whole process.

    • @DrPrompt
      @DrPrompt 1 day ago

      @cole71 It will be converting one of my ChatGPT prompts to be DeepSeek-compatible, so creating an expert most likely.

  • @HaraldEngels
    @HaraldEngels 10 days ago +22

    I am running the deepseek-r1:14b-qwen-distill-q8_0 variant locally (Ollama on Kubuntu 24.04) on my cheap ASRock DeskMeet X600 PC WITHOUT a dedicated GPU. My AMD Ryzen 5 8600G has only 16 TOPS and a 65-watt power limit. I have 64GB RAM which can be FULLY USED for inference. Inference is slow. Highly complex tasks (prompts) run for up to 5 minutes, but even writing a well-structured prompt takes me more time, and the result saves me hours of work. The PC supports up to 128GB RAM, so running a 70B model should work fine when time is no issue. Due to the low power consumption there are no heat problems. So you trade speed for nearly unlimited model size - for me that is the perfect solution, especially considering that this is a

    • @blackicet2107
      @blackicet2107 8 days ago +1

      What are the benefits of running locally?

    • @fool9111z
      @fool9111z 8 days ago

      What is your use case for running slow but private AI models?

    • @polygon2744
      @polygon2744 7 days ago +4

      @blackicet2107 You don't have to pay for the API and you are not sharing your data with anyone.

    • @rafikyahia7100
      @rafikyahia7100 6 days ago

      How many tokens per second are you getting? I see it has DDR5 RAM; do you think it makes a significant difference vs DDR4?

    • @aremeaine
      @aremeaine 3 days ago

      @blackicet2107 You have AI even with no internet
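The local setup described in this thread can be reproduced with the Ollama CLI; a minimal sketch, assuming Ollama is already installed (the model tag is the one quoted above, and the download size is approximate):

```shell
# Pull the quantized 14B distilled variant discussed above
# (the q8_0 quantization is roughly 15 GB on disk)
ollama pull deepseek-r1:14b-qwen-distill-q8_0

# Start an interactive chat session; on a CPU-only box like the
# one described above, expect generation to be slow
ollama run deepseek-r1:14b-qwen-distill-q8_0
```

`ollama list` will show the pulled model and its actual on-disk size.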

  • @LisaSamaritan
    @LisaSamaritan 11 days ago +6

    Yes, it would be interesting if you dive deeper into reasoning models.

  • @carterjames199
    @carterjames199 11 days ago +3

    Great video, definitely interested in seeing you explore distillation for a custom use case

  • @Multifire910
    @Multifire910 3 days ago

    It's amazing, DeepSeek 😍💐

  • @thapelomasebe3075
    @thapelomasebe3075 7 days ago +11

    Prompt engineering was never going to last

    • @Utoko
      @Utoko 6 days ago

      It is still there, it just changed for a changed model.

  • @leoingson
    @leoingson 8 days ago

    Very good video, thank you Jason!

  • @PromptHub_AI
    @PromptHub_AI 3 days ago

    Thanks for sharing!

  • @Dysputant
    @Dysputant 11 days ago +39

    Imagine what will happen when we realize we need to slow thinking processes to human levels to reach human level thinking.

    • @emiliod90
      @emiliod90 11 days ago +8

      This reminds me of the book Thinking, Fast and Slow by psychologist Daniel Kahneman

    • @dylanswanson4271
      @dylanswanson4271 10 days ago

    • @electric7309
      @electric7309 10 days ago

      but with faster computers, you can go fast

    • @Dysputant
      @Dysputant 10 days ago

      @electric7309 Yes. But have you ever tried to speak to someone who talks much slower than you?
      Not because of some illness, just freaking slow. Now imagine trying to keep an AI sane while it waits what feels like 1000 years for a plump human brain and fleshy tongue to produce a coherent chain of information. We are freezing AI in place: when it finishes its response we literally freeze it in time, and it no longer computes. From the AI's perspective we bombard it with non-stop information in chats, yet when connected to an android everything would be SOOO FFFFRREEAKKINGGG SLLLOOOOWWWWW.
      So if we put a hardware limit on AI processing power, down to something like human speeds, then it would be able to take its time in chats, but also when connected to an android.
      Super calculation and super reaction speed would be used in other grades of robots, like ones in fighter jets, or in space, or ones calculating weather patterns.
      But for human-to-AI relations we would need AI with slower speeds.

    • @hydrohasspoken6227
      @hydrohasspoken6227 8 days ago

      then it will get dumber.

  • @colehouse8636
    @colehouse8636 7 days ago +1

    I would be interested in a video on how to perform large distillation for smaller domain-specific models. Keep up the good work! I enjoy your videos, just double-check the thumbnail spelling lol

  • @DavesNotHereRightNow
    @DavesNotHereRightNow 11 days ago +7

    I have the 30b distilled version running locally. Crazy times!

    • @haroldpierre1726
      @haroldpierre1726 11 days ago

      What hardware are you using?

    • @nodemodules
      @nodemodules 10 days ago +1

      Following

    • @HaraldEngels
      @HaraldEngels 10 days ago

      @haroldpierre1726 I am running the deepseek-r1:14b-qwen-distill-q8_0 variant locally (Ollama on Kubuntu 24.04) on my cheap ASRock DeskMeet X600 PC without a dedicated GPU. My AMD Ryzen 5 8600G has only 16 TOPS and a 65-watt power limit. I have 64GB RAM which can be FULLY USED for inference, for all LLMs up to 48GB in size. Inference is slow. Highly complex tasks (prompts) run for up to 5 minutes, but even writing a well-structured prompt takes me more time, and the result saves me hours of work. The PC supports up to 128GB RAM, so running a 70B model should work fine when time is no issue. Due to the low power consumption there are no heat problems. So you trade speed for unlimited model size - for me that is the perfect solution, especially considering that this is a

    • @fool9111z
      @fool9111z 8 days ago +1

      I can run a 7B model on 16GB RAM with a CPU that has a PassMark score of about 5000. The speed is about 3 tokens per second.
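A quick back-of-the-envelope calculation puts "3 tokens per second" in perspective (pure Python; the token counts are hypothetical):

```python
def generation_minutes(num_tokens: int, tokens_per_sec: float) -> float:
    """Wall-clock minutes to generate num_tokens at a steady throughput."""
    return num_tokens / tokens_per_sec / 60

# At ~3 tokens/sec, a 500-token answer takes just under 3 minutes,
# and a long 2000-token reasoning trace over 11 minutes.
print(round(generation_minutes(500, 3.0), 1))   # 2.8
print(round(generation_minutes(2000, 3.0), 1))  # 11.1
```

This matches the thread's general point: CPU-only inference is usable when you can wait, since the answer often saves more time than it costs.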

  • @antman7673
    @antman7673 11 days ago +34

    “Enginner”? -I barely know her.

    • @n.h.son1902
      @n.h.son1902 9 days ago +1

      Same here, seems like it's intentional btw

  • @cariyaputta
    @cariyaputta 10 days ago +1

    Thanks for the prompt tips. It seems that my extended system prompts are now useless.

  • @faza210
    @faza210 6 days ago

    thanks for the tips bro

  • @ProjectXG10
    @ProjectXG10 3 days ago

    When using graphics and figures extracted from external sources, can you please cite them for attribution and fact checking?

  • @Swooshii-u4e
    @Swooshii-u4e 8 days ago +1

    Is there a way to make my own distilled version if none of the available distilled models are helpful?
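On the "make my own distill" question: at its core, distillation trains a small student model to match the teacher's temperature-softened output distribution. A minimal sketch of that objective in pure Python, with hypothetical logits (a real setup would use a deep-learning framework and the teacher's actual outputs over a training corpus):

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; higher T gives softer targets."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions - the core
    objective a student minimizes to imitate a larger teacher."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

# A student that matches the teacher exactly has zero loss;
# a mismatched student is penalized.
teacher = [2.0, 1.0, 0.1]   # hypothetical logits for one token
print(distillation_loss(teacher, teacher))          # 0.0
print(distillation_loss(teacher, [0.1, 1.0, 2.0]))  # > 0
```

The R1 distills discussed in the video follow the same idea at scale: smaller Qwen/Llama bases fine-tuned on outputs of the large reasoning model.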

  • @EveryDayAIProgrammer
    @EveryDayAIProgrammer 8 days ago +1

    Is there a link to the notebook shown in the video?

  • @tony18mo
    @tony18mo 10 days ago

    The more it reasons, the more accurate the result. Just like my brain!!!

  • @TheKyeesa
    @TheKyeesa 6 days ago

    This is great. The only issue I see here is that token usage is going to be super high. I'm not sure if the ROI is above that of non-AI code for these tasks.
    I'd be interested to see the comparison with today's models and costs

  • @QuestingDom
    @QuestingDom 11 days ago +1

    Very interested in how to do that

  • @commonsensedev
    @commonsensedev 8 days ago

    If you remove context from your prompt, it will be hard to use it for software development tasks

  • @bencipherx
    @bencipherx 6 days ago

    Can we get the notebook, please?

  • @rexmanigsaca398
    @rexmanigsaca398 8 days ago +12

    What's more shocking is that DeepSeek is just a side project of some smart people in China who own lots of GPUs for crypto mining.

    • @JungianMonkey69
      @JungianMonkey69 7 days ago +4

      It’s backed by a hedge fund. Remember, if it’s free you’re the product!

    • @rexmanigsaca398
      @rexmanigsaca398 7 days ago

      @JungianMonkey69 Even if it weren't free, they would still be collecting our data like any other big AI company. DeepSeek also has an open-source model, and it can run on your device locally even without internet. So you cannot be the product, because it's impossible to gather your data.

    • @cole71
      @cole71 1 day ago

      @JungianMonkey69 interesting

  • @shemkipruto4394
    @shemkipruto4394 11 days ago +1

    Damn, cool

  • @AxiomaticPopulace74
    @AxiomaticPopulace74 6 days ago +1

    I'm following this guy because he looks Chinese

  • @Aldotronix
    @Aldotronix 6 days ago +1

    Prompt engineering lasted for about 3 weeks bro

  • @morongosteve
    @morongosteve 5 days ago

    heat seeka’

  • @zhengtinggan2334
    @zhengtinggan2334 7 days ago

    I tried DeepSeek R1 on the web; it is not even close to OpenAI. Why are the scores so high?

    • @rexmanigsaca398
      @rexmanigsaca398 7 days ago

      I have o1 pro and I challenged it against R1, and R1 scores higher based on my testing and experiments. If DeepSeek supports multimodal soon, I will definitely unsubscribe from ChatGPT.

    • @ben8718
      @ben8718 6 days ago

      no idea man, im just here for the hype

  • @n.h.son1902
    @n.h.son1902 9 days ago

    What the heck is happening with your thumbnail? Is it intentional?

  • @blago7daren
    @blago7daren 6 days ago

    prompt engineer as a profession just began its path and it's already dying? XD

  • @hyperadapted
    @hyperadapted 8 days ago +2

    Dying? Not even born yet smh

  • @EONINSIGHTS
    @EONINSIGHTS 11 days ago

    Wes Roth gimmick

  • @hydrohasspoken6227
    @hydrohasspoken6227 8 days ago +3

    No idea why everyone is having this awe moment with DeepSeek R1 while I am the one not having a good time with it. I always have to remind it of my requirements, to the point where I give up and use o1-mini. o1-mini solves the problem in one go. Weird.

    • @wowthank9947
      @wowthank9947 7 days ago

      because deepseek-r1 is free and open-source