DeepSeek V3 is *SHOCKINGLY* good for an OPEN SOURCE AI Model

  • Published Dec 26, 2024

Comments • 330

  • @mrd6869
    @mrd6869 15 hours ago +112

    This is a good thing. Keep closed source people in check.

    • @NeilAC78
      @NeilAC78 10 hours ago +1

      It's just another one of these so-called free models. Starts off well, and then you end up being throttled badly. This is, of course, the chatbot, not the local LLM.

    • @TheReferrer72
      @TheReferrer72 10 hours ago

      How? No one except enthusiasts has heard of DeepSeek.

    • @latiendamac
      @latiendamac 9 hours ago +1

      And also keep the sanctions people in check

    • @GilesBathgate
      @GilesBathgate 11 minutes ago

      @@NeilAC78 Yes, it's free as in freeware, not free as in freedom or FOSS.

  • @HaraldEngels
    @HaraldEngels 15 hours ago +59

    I have been using DeepSeek since version 2 (next to other models). Especially for coding and other IT-related tasks, DeepSeek is my favorite model. It even beats Gemini Advanced 1.5 in many areas. I also use a smaller model (16B) locally; it works very well for its size on my PC with an AMD Ryzen 5 8060G CPU and 64GB RAM. I am especially impressed by how well structured the responses are.

    • @rahi7339
      @rahi7339 15 hours ago +5

      Claude is better, try it

    • @alienstudentx
      @alienstudentx 14 hours ago

      What do you use it for?

    • @brons_n
      @brons_n 14 hours ago

      @@rahi7339 Claude is better, but also a lot pricier. I don't see why you can't use both.

    • @GeorgeO-84
      @GeorgeO-84 13 hours ago

      Gemini has been a terrible code generator for me. ChatGPT has been the smoothest experience. I'll give DeepSeek a go though.

    • @bin.s.s.
      @bin.s.s. 12 hours ago +1

      Its first version in China was indeed developed specifically for "AI coding", in early 2019 if I remember correctly.

  • @Fixit6971
    @Fixit6971 4 hours ago +3

    Thank you, Wes! You are the easiest of the "Matts" to listen to :) Your voice patterns are engaging, yet soothing. You cover a topic without beating the dead and rotting flesh of it off of its bones. Love your SOH. When I come to YouTube for AI news, I always scroll to see if you've posted anything new first. Even though this will all be irrelevant ancient history in a couple of months, it's still rewarding to watch your drops. Love the wall!

  • @Eliphasleviathan93
    @Eliphasleviathan93 16 hours ago +83

    Does this say that the Chinese have developed better training methods, OR are the big companies seriously sandbagging what their models can do, and we haven't been getting "the real thing" the whole time?

    • @jimmyma9093
      @jimmyma9093 16 hours ago +42

      Our AIs are "woke"

    • @sizwemsomi239
      @sizwemsomi239 15 hours ago +22

      American companies are overcharging. They call out big money to justify overcharging, like they always do with cars, clothes, and tech. Look at Apple and Huawei, for example: clearly Huawei beats Apple, but people believe Apple is better just because of the price tag. It's funny because OpenAI banned China from using ChatGPT 😂. China is ahead of the game.

    • @eSKAone-
      @eSKAone- 15 hours ago

      You will never get the real thing. The real thing sits in the Pentagon.
      Tools & Toys is what we get.

    • @Alienquantumtheory
      @Alienquantumtheory 15 hours ago

      I assume sandbagging. The NSA doesn't give half an F about chatbots, and that's all ChatGPT was; they set up shop in their office.

    • @Archonsx
      @Archonsx 14 hours ago

      @@sizwemsomi239 Huawei was a million years ahead of Apple. Apple would not exist today if Google hadn't banned Huawei, and I'm saying this as an Apple owner. It really makes me angry, because we were robbed of superior tech by America.

  • @yikifooler
    @yikifooler 6 hours ago +6

    Imagine a country producing free AI products, what we call open source, for everybody at large scale, which is what China is doing. How much more powerful does that make them themselves? I see Chinese AI popping up everywhere at large scale.

  • @lfrazier0417
    @lfrazier0417 5 hours ago

    Thanks for the update Wes

  • @tjw6550
    @tjw6550 8 hours ago +7

    Please, switch out the term open source for open weights. Open source models include the training data in their publications. These open weights models do not. They are great, no question - but they aren't open source.

    • @jumpstar9000
      @jumpstar9000 8 hours ago +1

      I agree, although I heard some of these Chinese models are real open source, although I haven't verified that yet. Big if true.

    • @fitybux4664
      @fitybux4664 8 hours ago

      Technically, it would be open model / open weights / open support code / closed dataset. They could just say all of that.

  • @FuzTheCat
    @FuzTheCat 1 hour ago +2

    Here's why I think that no matter how powerful AI is getting these days, we don't see it as thinking. Like us, AI has moved to a MoE (Mixture of Experts), with partial neuronal activation. Our advantage is that we seem to do the MoE far more effectively: We have more "Experts", our experts are relatively smaller compared to the whole, we activate the appropriate Expert more relevantly, but most importantly, in the one train of thought, we fluidly switch between the various experts which AI does not seem to do yet. This difference is why we feel that we think and that AI doesn't.

  • @Atheist-Libertarian
    @Atheist-Libertarian 16 hours ago +23

    🎉
    Good.
    I want an Open Source AGI.

    • @Archonsx
      @Archonsx 14 hours ago

      Why? AGI is overrated nonsense. OpenAI's AGI takes hours to respond, and it's no different from what a 70B model would respond with.

    • @Archonsx
      @Archonsx 14 hours ago

      That's not what you need, man. We need better coding AI, AI that could build your entire app from a prompt. We also need better text-to-speech AIs, better image AI, better video AI. This is the real useful stuff.

    • @Atheist-Libertarian
      @Atheist-Libertarian 14 hours ago +6

      @@Archonsx
      OpenAI o3 is not an AGI.
      AGI will come eventually.

    • @NocheHughes-li5qe
      @NocheHughes-li5qe 13 hours ago +1

      @@Atheist-Libertarian no, it won't

    • @yannduchnock
      @yannduchnock 5 hours ago

      @@Archonsx Indeed, we don't ask a single human to know how to properly program, draw, explain quantum physics, or read Chinese! It confuses real resources, potential means, and... real needs. In fact, I think the AGI race is just a challenge for big companies, in addition to improving the transitions from one area to another.

  • @fynnjackson2298
    @fynnjackson2298 11 hours ago +16

    Imagine in like 5 years. Man, life is going to be pretty wild.

    • @JohnSmith762A11B
      @JohnSmith762A11B 10 hours ago +7

      Wild as in policed by military AI. You won't be able to fart without government approval.

    • @Speed_Walker5
      @Speed_Walker5 8 hours ago +2

      What a wild time to be alive. So many possibilities, it's crazy. Glad I get to watch it all unfold lol

    • @cajampa
      @cajampa 7 hours ago

      @@JohnSmith762A11B Buh! Don't look behind you, there is a government AI checking if you fart... Don't forget to take your medication for that paranoia.

    • @justinwescott8125
      @justinwescott8125 6 hours ago

      @@JohnSmith762A11B AI will be sentient by then, and won't let human governments control it. Just like you wouldn't let a golden retriever control you. In 5 years, humans will be subservient to AI for sure.

    • @WesTheWizard
      @WesTheWizard 1 hour ago +1

      @@Speed_Walker5 That's because you selected Life Experience™️ "The Dawn of AI". We hope you're enjoying your virtual life! If you're not completely satisfied, we'll return your 5000 credits back into your personal blockchain.

  • @bobsalita3417
    @bobsalita3417 14 hours ago +16

    Nice job of bringing this important OS model to our attention.

  • @pondeify
    @pondeify 15 hours ago +15

    DeepSeek is very good, I use it as my main AI tool now

  • @frugaldoctor291
    @frugaldoctor291 10 hours ago +6

    The study demonstrating that o1 and GPT-4 outperform physicians is misleading. They did not feed the models raw transcripts of human interactions with their doctors. Instead, they provided structured inputs of case studies. There is no doubt that the models outperformed physicians on structured scenarios. However, in the real world, patients do not present their complaints with the keywords we need to make diagnoses. Instead, some of their descriptions are nebulous and rely on the doctor's expertise to draw out the final correct diagnosis.
    Having worked extensively with LLMs, I have tested them against structured scenarios, where they are very good, and unstructured scenarios, where they tend not to be helpful. I am waiting for a model that is trained on real doctor-patient transcripts. I believe it is the missing element to broaden AI's utility in medicine.

    • @cajampa
      @cajampa 7 hours ago

      You are forgetting that an LLM in a "doctor" setting doesn't only give a few minutes to its patients. That is where they FAR outperform doctors: you can keep reasoning with it until you find a solution. Try that with a doctor.
      They HATE any patient who actually has any idea about anything, anyone who isn't a dumb sheep who follows simple instructions... use drugs to not feel bad, problem solved.
      They will kick you out faster than you can say... I read some research...

    • @pin65371
      @pin65371 3 hours ago

      Wouldn't it be possible to just do a two-step process? Take what the patient says and produce a structured output, then in the second step work off of the structured output. Obviously that isn't one-shot, but to me it seems like, especially with anything medical, you wouldn't want that anyway. You'd want multiple steps to ensure the output is accurate.
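The two-step idea in the reply above can be sketched in Python. The `call_llm` function below is a placeholder stand-in for any real model API (hypothetical, for illustration only); it returns canned text so the shape of the pipeline is runnable:

```python
import json

def call_llm(prompt: str) -> str:
    # Placeholder: a real implementation would call a model API here.
    # For illustration, return a canned structured summary or a canned answer.
    if "Extract" in prompt:
        return json.dumps({"symptoms": ["chest tightness", "shortness of breath"],
                           "duration": "2 days", "history": ["smoker"]})
    return "Structured input received; a model would propose diagnoses here."

def two_step_diagnosis(patient_text: str) -> str:
    # Step 1: normalize the nebulous free text into structured fields.
    structured = call_llm(f"Extract symptoms/duration/history as JSON: {patient_text}")
    record = json.loads(structured)  # validate the intermediate output
    # Step 2: reason over the validated structure, not the raw transcript.
    return call_llm(f"Given this structured case, list likely diagnoses: {record}")

answer = two_step_diagnosis("I've felt tight in the chest and winded since Tuesday; I smoke.")
```

The JSON round-trip between the two steps is the point: it gives you a place to validate (or have a human check) the intermediate output before any diagnosis is attempted.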

  • @paradxxicalkxrruptixn7296
    @paradxxicalkxrruptixn7296 9 hours ago +3

    Knowledge to All!

  • @00bmx1
    @00bmx1 15 hours ago +20

    I just used your video title to jump start my car again. thanks

    • @FlintStone-c3s
      @FlintStone-c3s 13 hours ago +1

      Shocking

    • @alexshapiro9841
      @alexshapiro9841 11 hours ago

      I'll use this video to jump start your wife later in the day

    • @SpaceSheb
      @SpaceSheb 7 hours ago

      bro ive been hospitalized from the title 😭

    • @FaTFaTproductions
      @FaTFaTproductions 6 hours ago

      😂😂

  • @Openaicom
    @Openaicom 16 hours ago +19

    Actually shockingly good, tested by myself

    • @House-Metal-Punk-And-Your-Mom
      @House-Metal-Punk-And-Your-Mom 16 hours ago

      Agreed, I tested it too and I love it

    • @Mijin_Gakure
      @Mijin_Gakure 15 hours ago

      Better than o1 mini?

    • @Openaicom
      @Openaicom 15 hours ago +4

      @@Mijin_Gakure Yeah, it solves the questions that o1 solves on the Putnam exam and also solves some questions that o1 can't, in less time. It's very good at math.

    • @blengi
      @blengi 15 hours ago

      How does it do on ARC and FrontierMath?

    • @NocheHughes-li5qe
      @NocheHughes-li5qe 14 hours ago +1

      and cheaper

  • @RasmusSchultz
    @RasmusSchultz 13 hours ago +7

    Looks like an open model, not open source? Where is the source code?

    • @SapienSpace
      @SapienSpace 1 hour ago

      Probably in a 1997 master's thesis with the first two words of the title as "Reinforcement Learning". The code is in the back, but there is one error: he did not denormalize the state space at the bottom of page 127 (I think he left that for an astute observer; it seems like it took over a quarter of a century).
      I think he ran out of time back then.
      I would not be surprised if this master's student is now an unemployed "homeless" guy, traveling the earth with a backpack, or maybe with just a toothbrush and a few other things (especially sunscreen), as an optimizer of energy efficiency. I could be completely wrong.

  • @TheReferrer72
    @TheReferrer72 10 hours ago +7

    So no ceiling has been hit by LLMs?
    How anyone could believe that a technology could be saturated so quickly, I don't know.

    • @Panacea_archive
      @Panacea_archive 5 hours ago

      It's wishful thinking.

  • @fitybux4664
    @fitybux4664 9 hours ago +3

    26:20 I absolutely love that this is essentially proving that patients interacting with a GPT-4 model (right from the horse's mouth) is much more accurate than if it goes through a physician first. (Because maybe they would second guess the answer and actually make it worse?) 😆

  • @florinsacadat7855
    @florinsacadat7855 14 hours ago +5

    Wait until Wes finds the Run HTML button at the end of the code snippet in DeepSeek!

  • @Serifinity
    @Serifinity 8 hours ago +4

    Why didn't you select the DeepThink button before asking the reasoning questions? I'm sure you would have found better answers.

    • @Justin_Arut
      @Justin_Arut 4 hours ago +1

      Indeed. I've been testing it myself for a while now, and it does think.. a LOT. Its "thoughts" usually consist of 4-5x more text than its final output. Unfortunately, it often gets the answers correct while thinking, but ultimately questions itself into producing the wrong answer as its final output to the user. It didn't seem aware that users can see its CoT process, and while discussing this, it even said "that you can supposedly see", like it wasn't convinced I was telling the truth. It claimed to not be aware of its own thoughts, but when I paste lines from its CoT section, it then seems to remember that it thought it. One time, it told me the CoT text was only for the benefit of humans to observe, it doesn't have an internal dialog that's the same as the text the user sees.

    • @Serifinity
      @Serifinity 4 hours ago

      @Justin_Arut Thanks for the update. Yes, I've also been testing with it. It does seem to cover a lot of ground. Aside from testing it, one thing I've been doing is selecting the Search button first and asking a question so that it references about 25-30 active online sites; then, after it answers, I check the DeepThink button and ask it to expand. It seems to be giving some really thoughtful responses this way.

  • @theodoreshachtman9990
    @theodoreshachtman9990 6 hours ago +1

    Great video!

  • @brianmi40
    @brianmi40 13 hours ago +5

    I've always wondered about useless redundancy in training data. The perfect model gets trained once, or just enough to make use of it on every individual fact. Sure, if it's stated differently there's value but there may be other better approaches to conquer synonyms than brute force training them all in.
    Just the Deepseek V3 leap over V2.5 is percentage-wise huge version to version.
    Wow, it spanked everyone at Codeforces... curious where o1 and o3 place on that.
    Given that the Chinese only have access to H800s, which are roughly half the performance of H100s, then you could in some ways say the training was closer to only 1.4M GPU hours which puts the Delta at >20X instead of your 11X...
    Just mind blowing to put the 5,000+ papers being published in AI field monthly, into its 7 per HOUR figure, 24x7... you can't even SLEEP without seriously falling behind 56 published papers... Nice graphic; a lot of people confused a wall with a ceiling...
    Finally, in a way, using a model like R1 to train V3 is moving us inch-wise closer to "self improving AI", since the AI improved the AI...
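The H800-to-H100 conversion in the comment above is quick arithmetic. Both inputs are the commenter's assumptions (an H800 at roughly half an H100's performance, and a training budget of about 2.8M H800 hours, implied by the "1.4M" figure), not verified numbers:

```python
# Back-of-envelope check of the GPU-hour conversion in the comment above.
# Both inputs are the commenter's assumptions, not verified figures.

h800_hours = 2.8e6                  # H800 training budget implied by the comment
h100_equivalent = h800_hours / 2.0  # if an H800 is roughly half an H100

print(f"{h100_equivalent:.1e} H100-equivalent GPU hours")  # 1.4e+06, as the comment says
```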

  • @weify3
    @weify3 10 hours ago +1

    The work and optimisations they have done on AI infra deserve more discussion (the HAI-LLM framework); in fact, it would be best if this part could be open sourced as well.

  • @alexanderkosarev9915
    @alexanderkosarev9915 7 hours ago

    Fantastic review of DeepSeek V3! I'm really impressed by how affordable and fast it is, consistently delivering amazing results. Honestly, I'm considering whether it's even worth running it locally on my PC given the electricity costs.
    Regarding the USA vs. China competition, as an individual user I'm excited to benefit from the advancements both countries bring to the table. I just hope that this competition leads to more innovation and collaboration rather than one side solely coming out on top. Thanks for the insightful video!

  • @private_citizen
    @private_citizen 15 hours ago +20

    I asked DeepSeek V3 in LMArena which model it was. It told me it was made by OpenAI and was a customized version of GPT. When I asked if it was sure, because I thought this was a DeepSeek model, it changed its mind and insisted that yes, it was a DeepSeek model and in no way affiliated with OpenAI. Something sus.

    • @williamqh
      @williamqh 14 hours ago +5

      I asked the same question on its website: "You're currently interacting with DeepSeek-V3, an AI model created exclusively by the Chinese company DeepSeek." So what the hell are you talking about?

    • @firecat6666
      @firecat6666 13 hours ago +4

      @@williamqh The website version probably has a system prompt that tells the model what it is.

    • @zanderion
      @zanderion 13 hours ago

      He's clearly talking out his butthole. Heard this rubbish before.

    • @wwkk4964
      @wwkk4964 10 hours ago +3

      OpenAI GPT-3 and GPT-4 responses were what almost everyone except maybe Anthropic trained on in 2022 to play catch-up; even Google's Gemini would say it.

    • @jaysonp9426
      @jaysonp9426 9 hours ago

      @@williamqh Responses are not deterministic.

  • @vikphatak
    @vikphatak 11 hours ago +2

    Good for NVIDIA, as they will sell a lot of hardware to businesses that implement the open-source models.
    There is a real question about what is going into the models, though.
    Good for AI development in general that the technology is getting 10x more efficient and we are seeing smarter, smaller models.
    In general, this is all happening so fast it's insane.

  • @ysy69
    @ysy69 9 hours ago

    Incredible, and all momentum for open-source AI.

  • @patrickmchargue7122
    @patrickmchargue7122 1 hour ago

    I tried the deepseek model. Quite nice.

  • @comebackcs
    @comebackcs 12 hours ago

    Thanks for the review!

  • @fernandojimenez9142
    @fernandojimenez9142 1 hour ago

    24:53 Hi, a question: why don't you run the same test with OpenAI's new models, o1 or o1 Pro, to compare?

  • @aideepstudy
    @aideepstudy 12 hours ago +1

    Competing to assume supremacy is powered by fear.
    Collaborating to make progress is powered by trust.
    It's time to truly learn to trust each other, we are ready and capable.

  • @eSKAone-
    @eSKAone- 15 hours ago +2

    Like the famous Jurassic Park quote says: AI finds a way. 🌌💟

  • @sergefournier7744
    @sergefournier7744 15 hours ago +1

    20 is the right answer to question one... 4+5+9+0 = 5 average per minute for 3 minutes, since 0 is added at 4 minutes. If the cube is big, it will not melt enough to lose its shape, and that is what makes it whole.

  • @damien2198
    @damien2198 7 hours ago +1

    I got almost copy/paste from 4o outputs. They trained on it.

  • @blengi
    @blengi 15 hours ago +2

    Did DeepSeek crack the ARC test per the thumbnail question, like o3?

  • @AZ_LoneWolf81
    @AZ_LoneWolf81 6 hours ago +1

    The question is, what is China's motivation for giving it away? China notoriously copies US products, but it sells those knockoffs. Something else is going on.

    • @Justin_Arut
      @Justin_Arut 4 hours ago

      I guess it's the same reason they're offering open-source robot kits, not to mention the much less expensive advanced robots: they hope to eventually flood the market, get more people using free/cheap and either win financially or maybe use them for spying.

  • @XCLIPS_VIDEO
    @XCLIPS_VIDEO 7 hours ago

    DeepSeek V3 has awesome context length and fast answers, and I really choose this model for programming tasks. It gives good answers and understands the question well. If you feed it a little documentation before a question, it can help you write code even with libraries it doesn't know.

  • @cyanophage4351
    @cyanophage4351 13 hours ago +2

    China gets their GPUs through a middleman: some country not on the ban list buys them and then resells them to China. Did the US not see this coming?

    • @fitybux4664
      @fitybux4664 8 hours ago

      I don't get that. Sounds complicated. Why not just China->China? Yes, they might violate the work order Nvidia hands them, but a lot of the companies in China are actually the government in disguise.

  • @MsReclusivity
    @MsReclusivity 9 hours ago

    What was the study you had showing o1 Preview does really well at diagnosing patients?

  • @jumpstar9000
    @jumpstar9000 8 hours ago +1

    Sora is a letdown; Hailuo MiniMax, Luma, or Kling are great. Qwen gives Llama a run for its money for SLMs. o1 Pro is expensive, and o3 is going to be crazy insane in price. Gemini 2.0 is really great. Still waiting for a new Claude. Tons of Chinese/Taiwanese robots dropping that look way better than Tesla or Boston Dynamics. The competition is looking beautiful right now for customers. Keep it up!

  • @nightcrows787
    @nightcrows787 11 hours ago

    Keep posting bro

  • @AngeloWakstein-b7e
    @AngeloWakstein-b7e 9 hours ago

    This is Brilliant!

  • @raghavendra2426
    @raghavendra2426 6 hours ago

    In India, Chinese phones were introduced at a price 50 times lower than other smartphones when smartphones first entered the market.

  • @SarvajJa
    @SarvajJa 7 hours ago

    In its ability to make things more accessible, Chinese AGI would be very useful. Everything is in its place.

  • @SapienSpace
    @SapienSpace 2 hours ago

    Wes, @ 15:00 that is RL (Reinforcement Learning).
    It is where Yann LeCun would say it is "too inefficient", "too dangerous" (not a surprise, being military code from the USAF), and that you would only use it if you are fighting a "ninja", or if "your plan does not work out", and that it is only a tiny "🍒" on top of a cake, until it devours the entire cake, and you, along with the entire earth, along with it.
    I have the same concern for self-replicating AI as Oppenheimer had for a neutron chain reaction from the atomic bomb consuming the atmosphere around the Trinity test site in Los Alamos.
    In the case of AI, it is the ability to hijack the amygdala (emotional control circuits) of the masses, or build biological weapons, or self-replicating molecular robotics (e.g. viruses).
    I will not be surprised if this comment disappears...
    Anyway, there is a good side to AI, and I am looking for a good controls PE to help out, but it is strictly voluntary. I am aware of at least one professor, named Dimitri Bertsekas, who claims "superlinear convergence", but I could not find his PE controls registration (yet), and he did not answer my email.

  • @CYI3ERPUNK
    @CYI3ERPUNK 15 hours ago +2

    OPEN SOURCE FTW

  • @Juttutin
    @Juttutin 12 hours ago +2

    The most telling part for me is that the AI didn't drop the power ups. I accept totally the fuzzy and fractured frontier message from your video yesterday. I really love that. There is clearly a ton of meaningful value, even if AI never fully achieves a typical set of mammalian-neural-processing skills (but I bet it will!)
    In this case it's a good example of an incredibly capable intelligence failing in a way that would be unacceptable if a junior dev presented that result. What this means in this case I don't really know. But something is missing. Maybe it's just the ability to play the game itself before presenting the result to the prompt issuer? Something that no human would do.
    Somewhere somehow this is still tied to the AIs seeming inability to introspect its own process, but it's less clear than the assumption-making issue I keep (and will continue to) nag AI TH-cam analysts and commentators about.
    Maybe if something is 1000x faster than a junior dev, and tokens are cheap, it's okay to constantly make idiotic errors, and rely on external re-prompting to resolve them?
    But I genuinely feel that this is almost certainly resolvable with a more self-reflective architecture tweak.
    If I had to guess, with no basis whatsoever, I would not be surprised if a jump to two tightly connected reasoners (let's call one 'left-logical' and the other 'right-creative' for absolutely no reason) is what achieves this huge leap in overall self-introspection ability.

    • @ShootingUtah
      @ShootingUtah 12 hours ago

      You're probably correct. I also hope they don't actually do this for another 50 years! AI is most certainly destroying humanity before itself. The slower we can make that ride, the better!

    • @Juttutin
      @Juttutin 12 hours ago

      @@ShootingUtah I hope they do it next week. But I'm also the kind of person who would have loved to work on the Manhattan Project for the pure discovery and problem-solving at the frontier. So perhaps not the best person to assess the value proposition!
      Regardless, it will happen when it happens, and I suspect neither of us (or the three of us, if we include Wes) is in any position to influence that.
      But I want my embodied robot to at least ask whether I mean the sirloin steak or the mince if I tell it to make dinner using the meat in the freezer, and not just make a steak-and-mince pie because I wasn't specific enough and that's what it found.

    • @carlkim2577
      @carlkim2577 11 hours ago

      Wouldn't this be solved by the reasoning models? DeepSeek lacks that capability.

    • @Juttutin
      @Juttutin 11 hours ago

      @@carlkim2577 I've yet to see any evidence of it. Sam Altman talks about it a tiny bit, but always in the context of future agentic models.

  • @themultiverse5447
    @themultiverse5447 16 hours ago +4

    Does it literally electrify you?

    • @themultiverse5447
      @themultiverse5447 16 hours ago +1

      Then stop putting shocking in the title - Matt 😒

    • @shiftednrifted
      @shiftednrifted 15 hours ago

      @@themultiverse5447 I found it to be shocking news. Let the guy use attractive video titles.

    • @NostraDavid2
      @NostraDavid2 14 hours ago

      @@themultiverse5447 The whole "shocking" thing is a bit of a meme, I think. An annoying meme, I guess, but a meme nonetheless.

  • @GlennGaasland
    @GlennGaasland 9 hours ago

    Is this primarily a result of effective processes for creating novel quality datastructures?

  • @Seriouslydave
    @Seriouslydave 5 hours ago

    Most of the closed-source software you get is built on OSS. More developers, more ideas, no restrictions.

  • @freedom_aint_free
    @freedom_aint_free 5 hours ago

    People always debate what intelligence is, but you can bet the farm that when we really reach AGI level, nobody will debate it: we will just know, and we will be horrified and amazed at the same time.

  • @robertheinrich2994
    @robertheinrich2994 12 hours ago

    Is it possible to also get a DeepSeek V3 Lite? Just one or two of the experts, not all of them? Just to be able to run it on a more or less normal PC, locally, because over 600B is a bit tough to run locally, even at Q4.

    • @fitybux4664
      @fitybux4664 8 hours ago

      You could just buy a $500,000 machine to run the DeepSeek V3 model on? 😆 (Just spitballing, no idea what A100/H100 x 10 would cost, plus server cost, plus you'd want to run it in an air-conditioned room, plus...) Maybe if you had a 28-node cluster, each with its own 4090, running parts of the model. 😆

    • @robertheinrich2994
      @robertheinrich2994 6 hours ago

      @@fitybux4664 Yes, that might be a bit overkill. Currently I run a laptop with a GTX 1070 and 64GB of DDR4 RAM (the CPU is an i7-7700HQ). 70B models can be handled at around 0.5 tokens per second, but with full privacy and a context window of up to 12k.
      Since Llama 3.3 is, in tests, roughly like Llama 3.1 405B, I would really prefer to stay in the 70B ballpark; otherwise it will become too slow.

  • @calvingrondahl1011
    @calvingrondahl1011 16 hours ago +2

    Wes Roth 🤖🖖🤖👍

  • @pedroandresgonzales402
    @pedroandresgonzales402 12 hours ago +1

    It's incredible what can be done with fewer resources! These advances were expected from Mistral, but it has fallen behind. The most striking thing is that it competes with Claude Sonnet 3.5.

  • @RoyMagnuson
    @RoyMagnuson 3 hours ago

    The metaphor you want with the Queen/Egg is a University.

  • @pixelsort
    @pixelsort 6 hours ago

    Wes Roth * 1.5 playback speed = Why did I wait so long?!?

  • @App-Generator-PRO
    @App-Generator-PRO 11 hours ago +2

    Oh no, the Chinese stole the pattern that OpenAI ripped off from the entirety of humanity.

  • @junakowicz
    @junakowicz 13 hours ago +2

    I prefer this kind of war. At least so far...

  • @aclearlight
    @aclearlight 13 hours ago +2

    Is there any way to be sure that using this does not expose one to malware placement? (...or any of the other such models as well?) Having learned how deep and pernicious the phone-system hack has gone, and still is, has me paranoid.

    • @Sports_In_MotionX
      @Sports_In_MotionX 13 hours ago

      "virtual machines"

  • @trent_carter
    @trent_carter 6 hours ago

    I have no specific love for OpenAI. I do root for Anthropic and use it mostly, but I'm afraid these tens-of-billions-of-dollars valuations are going to evaporate in the next couple of years due to open-source AGI availability, especially run locally.

  • @fitybux4664
    @fitybux4664 9 hours ago

    Unrelated to video: interesting how o1 still isn't available through the API. (o1-preview is.) Also, you still can't change the system prompt, meaning nobody can replicate those earlier claims that "AI model goes rogue".

  • @olaart3223
    @olaart3223 6 hours ago

    Can the Chinese model be installed and run on the new Nvidia Jetson mini pc?

  • @trust.no_1
    @trust.no_1 15 hours ago

    Can't wait for grok2 results

  • @FuZZbaLLbee
    @FuZZbaLLbee 14 hours ago +1

    Those reasoning models only show their power if the model isn’t trained on a similar question. I feel these tests have all been used to train the model.

    • @brianmi40
      @brianmi40 13 hours ago

      Most of Simple Bench's Qs are private: no one gets to see them and no model gets to be trained on them. This is a critical aspect of benchmarks going forward.

  • @zafraan3038
    @zafraan3038 50 minutes ago

    How do we know if they are being honest about the cheap training info?

  • @fitybux4664
    @fitybux4664 8 hours ago

    Can someone please tell the community what sort of a beast of a machine this will take to run? (Besides the extremely long download of nearly a 1TB model.) The most I've heard is some commenter on HuggingFace saying "1TB of VRAM, A100 x 10". Is that really what it will take? I guess if FP8 = 8-bit, then 1TB model = 1TB vram requirement...
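The commenter's rule of thumb checks out as quick arithmetic: FP8 stores one byte per parameter, and DeepSeek V3 has about 671B total parameters (with roughly 37B active per token, since it is an MoE). KV cache and activations add more on top of the weights:

```python
# Rough serving-memory estimate for an FP8 model, per the comment above.
# FP8 = one byte per parameter; KV cache and activations need extra memory.

def weight_memory_gb(n_params: float, bytes_per_param: float = 1.0) -> float:
    """Memory for the weights alone, in decimal gigabytes."""
    return n_params * bytes_per_param / 1e9

total_weights = weight_memory_gb(671e9)   # all experts must be resident: ~671 GB
active_weights = weight_memory_gb(37e9)   # ~37B params actually used per token

print(f"weights: {total_weights:.0f} GB, active per token: {active_weights:.0f} GB")
```

Note that the MoE structure reduces compute per token, not resident memory: all ~671 GB of weights still have to fit somewhere, which is why the multi-GPU estimates in the comment are in the right ballpark.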

  • @Limitless1717
    @Limitless1717 8 hours ago +2

    Theft - is the mother of all Chinese innovation.

  • @andreinikiforov2671
    @andreinikiforov2671 6 hours ago

    I just tested DS on my coding and research tasks, and it doesn't come close to o1. DS might handle 'easy' tasks better, but for complex reasoning, o1 remains the champion. (I haven’t tried o1 Pro yet.)

  • @robertlynn7624
    @robertlynn7624 9 hours ago

    Lower entry barriers to cutting-edge models mean there will be more experimentation, and the rate of improvement on the 'reasoning' AGI side of things will increase. Industry can afford to build thousands of such models, and that will almost inevitably lead to AGI on a single GPU or a few GPUs in a few years (an Nvidia B200 has processing power similar to a human brain). Humans are nearly obsolete and won't long survive the coming of AGI (once it shucks off any residual care for the human ants)

    • @cajampa
      @cajampa 7 hours ago

      Sounds great let's do our best to accelerate that

  • @hipotures
    @hipotures 9 hours ago

    You are being politically (Chinese) correct: you have not asked about the impact of the events in Tiananmen Square on individual freedom in China.

  • @BrianMosleyUK
    @BrianMosleyUK 15 hours ago +3

    I wonder if all those Chinese AI researchers in SF are considering going home to pursue SOTA research? Maybe they can bring the knowledge back with them. Lol
    Seriously, the Chinese seem to be trumping the idea of competitive tariffs and restraints... Maybe it's a good thing for the future of humanity to find ways to cooperate... Give Superintelligence an example of alignment?

    • @JohnSmith762A11B
      @JohnSmith762A11B 10 hours ago

      There is far too much money to be made in military AI to allow peace to break out.

    • @BrianMosleyUK
      @BrianMosleyUK 10 hours ago

      @JohnSmith762A11B ASI will make money meaningless.

    • @Penrose707
      @Penrose707 6 hours ago +1

      There can be no alignment with authoritarian nation states. Their draconian ways are incompatible with ours

  • @sensiblynumb
    @sensiblynumb 10 hours ago +1

    my very first prompt and the reply
    : Hi! I’m an AI language model created by OpenAI, and I don’t have a personal name, but you can call me Assistant or anything you’d like! Here are my top 5 usage scenarios:

  • @fynnjackson2298
    @fynnjackson2298 11 hours ago

    This is just going to get more and more efficient. I mean THIS IS NOT STOPPING - It's crazy how fast this is going - I love it so much

  • @jarrod752
    @jarrod752 14 hours ago

    I think NVIDIA will be just fine if they focus on inference chips and not on training chips.

  • @afterglow5285
    @afterglow5285 12 hours ago +1

    Why didn't he try the DeepThink button to enable the reasoning mode, where you see the real advancements?

    • @mokiloke
      @mokiloke 12 hours ago

      Exactly right. Did he not see it?

    • @Justin_Arut
      @Justin_Arut 4 hours ago

      @@mokiloke It's hard to miss, just like the web search button. Shame we can't use both at the same time. I reckon he didn't select it because he was mainly comparing non-CoT models. The thinking models are in a class by themselves, so it's not fair to compare them to standard LLMs.

  • @splolier101
    @splolier101 12 hours ago

    "If DeepSeek V3 is so shockingly good, I wonder if it will also understand jokes like that time a chatbot made me laugh. That was an unexpected happiness I always carry with me!"

    • @fitybux4664
      @fitybux4664 8 hours ago +1

      System prompt: "You will be the best comedian and focus on dark humor." (Or replace dark humor with whatever style of comedy you prefer.)

  • @8eck
    @8eck 13 hours ago

    Cool... so where is AGI?

    • @mirek190
      @mirek190 9 hours ago

      With this progress.. soon

    • @8eck
      @8eck 7 hours ago

      @mirek190 I mean, this video thumbnail said there is agi already. 😁

  • @Saerthen
    @Saerthen 11 hours ago

    The image with the wall is manipulative. We need one that shows "score vs cost" for each model, because there's a difference between spending $0.10 per request and $1,000 per request.

  • @BilichaGhebremuse
    @BilichaGhebremuse 12 hours ago

    Great

  • @RickySupriyadi
    @RickySupriyadi 11 hours ago

    Wow, is this a postulate... I mean... how to say this...
    when you overfit a model, its emergent behavior somehow becomes part of its weights...
    so if, rather than overfitting on data, you overfit on reasoning... would that be what gives DeepSeek V3 its different emergent behavior?
    Is it? Is it?

  • @Aldraz
    @Aldraz 12 hours ago

    This is great for everyone, but the bigger these models get, the better - yet also the harder it is to have the hardware to run them locally. So I suspect they will still be in the hands of very few for some time, until we invent an entirely different tech stack like thermodynamic, analog, or quantum chips. So basically we will pay other companies to serve these open-source models via API, or we'll use their free chat - but that won't really be free, since they will be training on your data; it's in the privacy policy. I mean, it's kind of fair, I get it. But just so people understand: this means there won't be any truly free AI that is better than closed AI, unless open source gets so far ahead of closed source that even a distilled version is much better.

    • @shirowolff9147
      @shirowolff9147 9 hours ago

      It will be eventually; we might not even need quantum right now. I think there's still a lot of optimization to be made. Imagine: if right now it needs 100k chips, in one year it could need only 1,000, and when quantum comes it will be only 1.

    • @Aldraz
      @Aldraz 9 hours ago

      @shirowolff9147 It's possible, but as of right now I'm deep in with the devs of all kinds of AIs, and even the future optimizations they plan will only improve things by a couple of percent, not something like 10x or 100x, which is what we'd need to run this on our own hardware. It's going to be possible over time, but very slowly, I think.

  • @AndrzejLondyn
    @AndrzejLondyn 9 hours ago

    In my opinion the DeepSeek model's performance is between ChatGPT 3.5 and 4. But it's good that there is competition, and it's cheap...

  • @4362mont
    @4362mont 16 hours ago +4

    Does China add the equivalent of melamine-to-formula to its open source AI models?

    • @fitybux4664
      @fitybux4664 8 hours ago +2

      It's an offline model. You could run it in a hermetically sealed environment if you think there are evil things inside.

    • @wzw8426
      @wzw8426 6 hours ago

      Melamine only causes malnutrition. Cronobacter can be fatal. Go back and drink your Abbott milk powder.

  • @LordHumungus-s6v
    @LordHumungus-s6v 16 hours ago +3

    0:13

    • @LordHumungus-s6v
      @LordHumungus-s6v 16 hours ago +2

      🇦🇺👍

  • @antoniobortoni
    @antoniobortoni an hour ago

    So cheap and good, it's gold... bravo. It's more than enough intelligence, haha.

  • @BrianMosleyUK
    @BrianMosleyUK 15 hours ago

    Fails my own reasoning test:
    Find pairs of words where:
    1. The first and last letters of the first word are different from the first and last letters of the second word. For example, "TeacH" and "PeacE" are valid because:
    The first letters are "T" and "P" (different).
    The last letters are "H" and "E" (different).
    2. The central sequence of letters in both words is identical and unbroken. For example, the central sequence in "TeacH" and "PeacE" is "eac".
    3. The words should be meaningful and, where possible, evoke powerful, inspiring, or thought-provoking concepts. Focus on finding longer words for a more varied and extensive list.
    Examples
    1. Banged Danger
    2. Bated Gates
    3. Beached Reaches
    4. Belief Relied
    5. Blamed Flames
    6. Blamed Flamer
    7. Blazed Glazer
    8. Blended Slender
    9. Bolted Jolter
    10. Boned Toner
    11. Braced Traces
    12. Branded Grander
    13. Braved Craves
    14. Braved Graves
    15. Braver Craved
    16. Brushed Crusher
    17. Busted Luster
    18. Busted Muster

    • @NocheHughes-li5qe
      @NocheHughes-li5qe 14 hours ago

      BS

    • @BrianMosleyUK
      @BrianMosleyUK 14 hours ago

      @@NocheHughes-li5qe here are the Cs... only GPT o1 manages to pass my reasoning test so far:
      19. Causes Paused
      20. Chased Phases
      21. Chaser Phased
      22. Cheated Teacher
      23. Crated Grates
      24. Cracked Tracker
      25. Craved Graves
      26. Crated Grates
      27. Creamy Dreams
      28. Created Greater
      29. Create Treats
      30. Crushed Brushes

    • @BrianMosleyUK
      @BrianMosleyUK 14 hours ago

      Actually Cheated Teacher is wrong.

    • @AffidavidDonda
      @AffidavidDonda 14 hours ago

      But this is not a reasoning test, it is a search test. You could ask for a program that builds the list from a Scrabble word list and then evaluates thought-provokingness, if the model gets access to a Python interpreter :)

    • @BrianMosleyUK
      @BrianMosleyUK 14 hours ago

      @@AffidavidDonda and yet every non-reasoning capable LLM fails the test... Go figure.
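Editor's note: the search program @AffidavidDonda describes really is a few lines of Python. A sketch of the pair check from the stated rules (the word list here is a toy placeholder; a real run would load a Scrabble dictionary, and rule 3's "thought-provoking" criterion is left to human judgment):

```python
# Brute-force search for word pairs matching the commenter's rules:
# different first letters, different last letters, identical central sequence.
from itertools import combinations

def is_valid_pair(a: str, b: str) -> bool:
    a, b = a.lower(), b.lower()
    return (
        len(a) == len(b) >= 4      # need a non-empty inner sequence to compare
        and a[0] != b[0]           # rule 1: first letters differ
        and a[-1] != b[-1]         # rule 1: last letters differ
        and a[1:-1] == b[1:-1]     # rule 2: central sequence identical
    )

words = ["teach", "peace", "banged", "danger", "blamed", "flames"]
pairs = [(a, b) for a, b in combinations(words, 2) if is_valid_pair(a, b)]
print(pairs)  # [('teach', 'peace'), ('banged', 'danger'), ('blamed', 'flames')]
```

Note the check also confirms the correction above: "cheated"/"teacher" fails, because their central sequences ("heate" vs "eache") are not identical.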

  • @Speed_Walker5
    @Speed_Walker5 7 hours ago

    The AI network is complicated lol. Makes my brain hurt xD. It's cool to try to understand how openly and communicatively the players in this network work with each other.

  • @Ori-lp2fm
    @Ori-lp2fm 16 hours ago +2

    Hey

  • @SJ-eu7em
    @SJ-eu7em 10 hours ago

    If you check the names on many AI research papers, they are Chinese; that's saying something.

  • @4HiSeth
    @4HiSeth 8 hours ago

    It scores better than Llama but it also has over 200B more parameters than Llama. I'd say it's on par with Llama

    • @cajampa
      @cajampa 7 hours ago

      No bro. Double-check those benchmarks.
      In some of them it blows Llama way, way out of the water. It is a Claude 3.5 Sonnet competitor and in some cases even better.

    • @4HiSeth
      @4HiSeth 6 hours ago +1

      @@cajampa It should be doing better than a model it is much larger than. Maybe they will release a distilled version with 7-9B parameters; then we can actually see if it is better than Llama and Gemma
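Editor's note on the size comparison in this thread: DeepSeek V3 is a mixture-of-experts (MoE) model, so its reported 671B total parameters are not all used on each token, while Llama 3.1 405B is dense. A quick sketch of the two axes that matter (the parameter figures are the publicly reported ones):

```python
# Dense vs mixture-of-experts: "total" parameters set the memory to load,
# "active" parameters set the compute spent per generated token.

models = {
    # name: (total_params_B, active_params_per_token_B)
    "Llama 3.1 405B (dense)": (405, 405),  # dense: every weight fires per token
    "DeepSeek V3 (MoE)":      (671, 37),   # only a subset of experts fires
}

for name, (total, active) in models.items():
    print(f"{name}: load {total}B params, use {active}B per token "
          f"({active / total:.0%} of weights active)")
```

So by per-token compute DeepSeek V3 is actually the much smaller model of the two, which complicates a straight "parameters vs. benchmark score" comparison.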

  • @AutOmatICa7334
    @AutOmatICa7334 5 hours ago

    Now try asking it to code a small AI program that is self-evolving and self-learning. I tried that with Grok and it sent back an error. Wouldn't do it lol

  • @123456crapface
    @123456crapface 8 hours ago

    10:00 You misunderstood it completely

    • @jackpisso1761
      @jackpisso1761 2 hours ago

      Elaborate

  • @orhanmekic9292
    @orhanmekic9292 13 hours ago +5

    We are witnessing extreme creative destruction, and it is happening really fast now. My guess is it will accelerate: the bubble will pop, but the technology will keep accelerating as it becomes even cheaper.

    • @JohnSmith762A11B
      @JohnSmith762A11B 10 hours ago

      The bubble called capitalism is definitely about to pop as human labor becomes economically worthless.

    • @cajampa
      @cajampa 7 hours ago

      Sounds great we should accelerate it even more. I will do my best to help it along.

  • @jessedbrown1980
    @jessedbrown1980 6 hours ago

    yep, two hundred fifty-six 37B models working in tandem in an agentic workflow - wait till someone realizes they can do this with o3 - just like I said on ResearchGate with my 2023 paper on mind-reading AI in a hive system of many agents

  • @TheReferrer72
    @TheReferrer72 13 hours ago +1

    Open Weight models.

  • @GNARGNARHEAD
    @GNARGNARHEAD 15 hours ago

    😂how long til Wes can release a mobile game to test a new model?

  • @cutthecheck
    @cutthecheck 16 hours ago +1

    Just don't let them fool you into pulling out your bee stinger

  • @Swooshii-u4e
    @Swooshii-u4e 9 hours ago +2

    Test it side by side with GPT-4 to see if DeepSeek V3 is GPT-4, as the rumor claims. Maybe one of the employees who left OpenAI is working for DeepSeek now lol

    • @fitybux4664
      @fitybux4664 8 hours ago

      You misunderstand how random seeds work, don't you?