The new Qwen Model blows my mind! Qwen 72B 2.5-VL tested

  • Published on Feb 1, 2025

Comments • 18

  • @jeffwads
    1 day ago +5

    Because of the DeepSeek model, I paid Tekboost a visit to grab one of their monster 1.5TB Z8 G4 dual-Xeon 18-core rigs for around 4K. I need the full 128K context and don't care about slower inference, as long as I get great results in the output.

    • @GosuCoder
      1 day ago

      That is an incredible setup you have there! I'm jealous lol

    • @hydrohasspoken6227
      14 hours ago

      What if the model quickly becomes obsolete and you need much more horsepower for the next breakthrough model?

  • @NetmainRiadh
    2 hours ago

    I used it to transcribe handwritten text, and it was really amazing, mind-blowing. All the other models either refused to work or failed to transcribe the text correctly and gave me unreadable output.

  • @DaveEtchells
    1 day ago +2

    Mind-blowing that a 72B model can code this well - *and* has vision too!
    I haven’t looked, but how big is the model file if you were to download it (if you even can)?
    72B seems like it could run on something like a 128GB M4 Pro machine.
    The model companies are all moving slowly on the agent front, but this and the R1 revelation are telling me that quite small local “thinking” models will be plenty capable of taking on the personal assistant role: checking email, managing appointments, taking meeting notes, etc. The hangup is the potential for screw-ups if you give the assistant full access to your computer and accounts, but purely capability-wise, we’re getting really close to small, efficient, practical assistant agents running locally.
    🤯

    • @GosuCoder
      23 hours ago

      When I tried to set it up in SageMaker, I was going to test it with 128GB. Digits can’t come fast enough for me! Have you tried Operator by chance? I’ve heard mixed things about it but haven’t had a chance to actually try it.
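
On the file-size question raised in this thread, a rough back-of-the-envelope estimate is parameter count times bytes per parameter. The sketch below assumes the nominal "72B" figure at face value and counts only the weights; the function name and the chosen precisions are illustrative, not taken from any Qwen documentation. Real downloads and real memory use will be somewhat larger, since quantized formats carry extra metadata and the KV cache for long contexts is not included.

```python
def weight_memory_gb(n_params: float, bits_per_param: float) -> float:
    """Approximate size of the weights alone, in decimal gigabytes."""
    return n_params * bits_per_param / 8 / 1e9


# Assumption: the nominal "72B" parameter count, taken at face value.
N_PARAMS = 72e9

# Illustrative precisions: 16-bit (bf16/fp16), 8-bit, and 4-bit quantization.
for label, bits in [("16-bit", 16), ("8-bit", 8), ("4-bit", 4)]:
    print(f"{label}: ~{weight_memory_gb(N_PARAMS, bits):.0f} GB")
```

Under these assumptions, the 16-bit weights alone come out larger than 128GB, while 8-bit and 4-bit versions would leave headroom on a 128GB machine.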

  • @UCs6ktlulE5BEeb3vBBOu6DQ
    1 day ago

    Qwen2.5 72B Instruct has been my daily driver ever since it launched. Its bang for the buck in code quality is unmatched. My favorite part is when you get errors in ChatGPT like "Too many concurrent connections" and I'm fully functional offline at home!

  • @JaysThoughts-q5e
    1 day ago

    12:55 Yeah, I gotta have API access. I'm not a great programmer but I know how to loop things via the API.

  • @UCs6ktlulE5BEeb3vBBOu6DQ
    1 day ago

    Also, I do not use the newer 2.5 VL because I believe they had to prune the 2.5 Instruct model to make room for the vision stuff, which I do not care for.

    • @GosuCoder
      1 day ago

      I’m curious where you saw that? It makes me want to test the VL model side by side with original 2.5 now.

    • @UCs6ktlulE5BEeb3vBBOu6DQ
      1 day ago

      @GosuCoder I read a lot about it, and the vision part is literally another smaller model merged with the main model. I don't fully understand it all, but I believe that layers 1 to x are text generation and layers x to y are vision stuff. So they obviously had to prune stuff from the text model to fit the vision part. Also, the logic power of the model has to be lower, since the 72B in the VL is the combined total. I'd bet 100% that the older 72B Instruct has more logic power.

  • @vncstudio
    16 hours ago

    The VL is excellent for handwriting transcription and much better than Llama 3.2 Vision 90B. It is just a tad below Google Pro Vision and Claude Sonnet. Handwriting transcription used to be very difficult before 2024.

    • @GosuCoder
      13 hours ago

      That’s awesome! I need to test that more.

  • @maxxflyer
    12 hours ago

    Closed source and no API?

    • @GosuCoder
      12 hours ago +1

      It's open weights, but I'm having trouble deploying it myself.

  • @xryanlee
    23 hours ago +1

    So you can actually pay to load Qwen's API as well.

    • @GosuCoder
      23 hours ago +1

      Really? I tried finding that.

    • @KrahsThe
      10 hours ago

      I've been trying to connect to it from Cline but was unable to.