Qwen-2.5 Coder 32B: BEST Opensource Coding LLM EVER! (Beats GPT-4o + On Par With Claude 3.5 Sonnet!)

  • Published 17 Dec 2024

Comments • 63

  • @intheworldofai
    @intheworldofai  1 month ago +3

    Want to HIRE us to implement AI into your Business or Workflow? Fill out this work form: td730kenue7.typeform.com/to/WndMD5l7
    💗 Thank you so much for watching, guys! I would highly appreciate it if you subscribe (turn on the notification bell), like, and comment on what else you want to see!
    📆 Book a 1-On-1 Consulting Call With Me: calendly.com/worldzofai/ai-consulting-call-1
    🔥 Become a Patron (Private Discord): patreon.com/WorldofAi
    🧠 Follow me on Twitter: twitter.com/intheworldofai
    Love y'all and have an amazing day, fellas. Thank you so much, guys!

  • @intheworldofai
    @intheworldofai  1 month ago +24

    Microsoft's AI Toolkit - VS Code: FREE AI Extension BEATS Cursor! (GPT-4o + Sonnet 3.5 FREE!): th-cam.com/video/JYikeQ4ySes/w-d-xo.html

  • @alals6794
    @alals6794  1 month ago +6

    Prior to this I was locally running qwen2.5 coder 7B bf16 and it was great, for its size. Can't wait to locally run qwen2.5 coder 32B!

    • @intheworldofai
      @intheworldofai  1 month ago +1

      The 7B model was quite impressive for its size. This 32B model will surely blow your mind!

    • @lckillah
      @lckillah  1 month ago +1

      What kind of workstation are you running to be able to run 32B? I'm new to ML and just now learning, so I'm wondering if I'd need an upgrade from an M3 18GB Mac Pro.

    • @Kaalkian
      @Kaalkian  1 month ago +1

      @@lckillah M4 Pro/Max, at least 48GB, preferably 64GB. 64GB would let you run Qwen2.5 32B with a good quant and a long context, and still have spare RAM to use the computer, lol.
      At least as of week 2 of November; things are going berserk day by day.

    • @DickerehikariDuck
      @DickerehikariDuck  26 days ago

      @@Kaalkian So basically, to run this model the system needs to have at least 32GB of RAM?

  • @thisisashan
    @thisisashan  1 month ago +20

    As a consistent viewer, I would really love to see a "best of" update for each category instead of the constant litany of clickbait.
    It is getting hard to want to watch every "Best ever!" video, but I would really love to know what competes the best:
    Best LLM router.
    Best anime image gen.
    Best realistic image gen.
    Best logic LLM.
    Best programming LLM.
    etc.
    People do these, but I seldom get any information about the specific LLMs and fine-tunes that you cover on this channel.
    Just saying, since your $$$ is based on how much of each video I want to watch.
    Not meant as judgment.
    Thanks for what you do.

    • @intheworldofai
      @intheworldofai  1 month ago +7

      Thanks for the feedback! I really appreciate your input. I’ll definitely consider providing more details on the specific LLMs and fine-tunes featured in the videos. Your support means a lot, and I’m always looking to improve the content for you all!

  • @Foxy_proxy
    @Foxy_proxy  1 month ago +4

    What kinda specs do you need to run this locally?

    • @DanaRami93
      @DanaRami93  1 month ago

      Follow

  • @intheworldofai
    @intheworldofai  14 days ago

    Athene-v2 72B: NEW Opensource LLM Beats Sonnet & GPT-4o! (Free API): th-cam.com/video/zDMNM1vbMOY/w-d-xo.html

  • @delta-gg
    @delta-gg  1 month ago +3

    What hardware would you suggest for running Qwen-2.5 Coder 32B? Like what graphics card minimum, and what system memory?

    • @paulyflynn
      @paulyflynn  1 month ago

      M4 Max 128GB

    • @johndaily9869
      @johndaily9869  1 month ago +1

      4090

    • @antonivanov5782
      @antonivanov5782  1 month ago +3

      RTX 3090 24GB, 32GB RAM

    • @alals6794
      @alals6794  1 month ago +1

      Actually, you can run it on an 8GB VRAM GPU, but you need a LOT of RAM: about 64GB of DDR4 for about $100 USD, or 64GB of DDR5 for less than $200. If you split the load between your low VRAM and massive RAM, you do need custom code to make it run. I might do a YT video on that and use it to launch my future AI channel.
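The VRAM/RAM split described above is what llama.cpp-style layer offloading does: put as many transformer layers as fit on the GPU and run the rest from system RAM. A rough sketch of the sizing arithmetic, where the layer count, weight size, and overhead figures are illustrative assumptions rather than measured values:

```python
# Estimate how many transformer layers fit on a small GPU when the
# rest are offloaded to system RAM (llama.cpp-style split).
# All sizes below are assumptions for illustration, not measurements.

def layers_on_gpu(vram_gb: float, n_layers: int, model_gb: float,
                  overhead_gb: float = 1.5) -> int:
    """How many of n_layers fit in vram_gb, reserving overhead_gb
    for KV cache, activations, and the CUDA context."""
    per_layer_gb = model_gb / n_layers          # assume uniform layers
    usable = max(vram_gb - overhead_gb, 0.0)
    return min(n_layers, int(usable / per_layer_gb))

# Assumed figures: a 32B model at 8-bit is ~33 GB of weights,
# spread over an assumed 64 layers.
n = layers_on_gpu(vram_gb=8, n_layers=64, model_gb=33)
print(n, "layers on GPU, the rest in system RAM")
```

With an 8GB card, only a small fraction of the layers land on the GPU, which is why the hybrid setup works but runs far slower than an all-VRAM configuration.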

    • @andrepaes3908
      @andrepaes3908  1 month ago +2

      If you run it at max specs (32B parameters at fp32 + 32k context size), I estimate you need 192GB of VRAM. That means an 8x3090 Nvidia config, which is not feasible on any consumer-grade hardware. But you can run it at 8-bit integer quant with small quality loss using 48GB of VRAM; a 2x3090 config would be enough, and I estimate speed at 15 tokens/sec. A recently launched Mac M4 Pro mini with 64GB RAM would also do it, but at half the speed of the 2x3090 config (7.5 tokens/sec).
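The weight-memory part of that estimate is easy to sanity-check: parameters times bytes per parameter, ignoring the KV cache and activations (which is what pushes the fp32 + 32k-context figure toward 192GB). A minimal sketch:

```python
# Back-of-envelope memory for model weights: params * bytes per param.
# KV cache and activations come on top; this covers weights only.

def weight_gb(params_b: float, bytes_per_param: float) -> float:
    """Gigabytes of weight memory for params_b billion parameters."""
    return params_b * bytes_per_param  # 1e9 params * N bytes ~= N GB

for name, bpp in [("fp32", 4), ("fp16", 2), ("int8", 1), ("4-bit", 0.5)]:
    print(f"{name}: ~{weight_gb(32, bpp):.0f} GB")
```

fp32 alone is ~128GB of weights, so with a 32k context the multi-GPU estimate is plausible, while int8 (~32GB) fits across two 24GB cards as the commenter says.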

  • @WolfCat787
    @WolfCat787  1 month ago +1

    So, at 32B, how much memory is required to run this model? Can a single 4090 handle it?

    • @chucky_genz
      @chucky_genz  1 month ago

      For LLMs a 4090 is OK, bro.

  • @intheworldofai
    @intheworldofai  27 days ago

    Deepseek-R1-Lite: BEST Opensource LLM EVER! Beats Claude 3.5 Sonnet + O1! - (Fully Tested): th-cam.com/video/a61R0HSSwUU/w-d-xo.html

  • @eado9440
    @eado9440  1 month ago +2

    Epic, on par with DeepSeek 2.5 (which is amazing for the price, just a little slow, and has caching). If this is faster, and hopefully just as cheap, I might just switch over.

    • @intheworldofai
      @intheworldofai  1 month ago +1

      Hopefully the Qwen 3 model series will improve on inference speeds.

  • @intheworldofai
    @intheworldofai  1 month ago +1

    [Must Watch]:
    Qwen-2.5: The BEST Opensource LLM EVER! (Beats Llama 3.1-405B + On Par With GPT-4o): th-cam.com/video/yd0kgDwkfz0/w-d-xo.htmlsi=Uh2eCpIWYpcY54Hq
    DeepSeek-v2.5: BEST Opensource LLM! (Beats Claude, GPT-4o, & Gemini) - Full Test: th-cam.com/video/mvpkZ1yFy7o/w-d-xo.htmlsi=NR9ChO50-HKJW9Cb
    Bolt.New + Ollama: AI Coding Agent BEATS v0, Cursor, Bolt.New, & Cline! - 100% Local + FREE!: th-cam.com/video/ZooojV4ZDMw/w-d-xo.html

  • @mazinngostoso
    @mazinngostoso  1 month ago +16

    Make a video listing the 5 best AI APIs that are completely free 👍

    • @BeastModeDR614
      @BeastModeDR614  1 month ago +1

      Akash Network has a free AI API.

    • @williamcase426
      @williamcase426  1 month ago

      AI gonna kill us

  • @investfoxy
    @investfoxy  1 month ago

    What would you recommend between LM Studio and Pinokio for running LLMs?

  • @intheworldofai
    @intheworldofai  1 month ago

    Gemini Exp 1114: The BEST LLM Ever! Beats o1-Preview + Claude 3.5 Sonnet!: th-cam.com/video/HcxMwM0hwo0/w-d-xo.html

  • @Dom-zy1qy
    @Dom-zy1qy  1 month ago

    Might need to buy some Tesla M40s and put together a rig. Surprisingly, there seem to be reputable listings on eBay for cheap. 24GB VRAM for ~$100? Wonder how many tokens/sec a single card would get.
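A quick way to ballpark that: single-stream decoding is roughly memory-bandwidth-bound, since every generated token has to stream all the weights from VRAM, so bandwidth divided by model size gives an upper bound on tokens/sec. The M40's ~288 GB/s bandwidth is from its public spec; the 20GB model size assumes a 4-bit quant of the 32B model:

```python
# Upper bound on batch-1 decode speed: each token reads every weight
# once, so tok/s <= memory bandwidth / model size in memory.

def max_tokens_per_sec(bandwidth_gbs: float, model_gb: float) -> float:
    return bandwidth_gbs / model_gb

# Tesla M40: ~288 GB/s memory bandwidth; 4-bit 32B model ~= 20 GB.
print(f"~{max_tokens_per_sec(288, 20):.1f} tok/s ceiling")
```

That works out to a ceiling around 14 tok/s per card; real-world numbers would be lower (and the M40's lack of modern compute features hurts prompt processing in particular).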

  • @abc_cba
    @abc_cba  1 month ago

    Does anyone know how it compares to:
    1) Nvidia/llama-3.1-nemotron-70b-instruct
    2) Qwen-2.5-72B-Instruct

    • @djds4rce
      @djds4rce  1 month ago

      Better than Qwen 72B at coding.

  • @avalagum7957
    @avalagum7957  1 month ago

    Is it possible to use it with JetBrains IDEs? If yes, how?

  • @intheworldofai
    @intheworldofai  1 month ago

    Aider UPDATE: The BEST AI Coding Agent BEATS v0, Cursor, Bolt.New, & Cline!: th-cam.com/video/ywKTxA9FU_4/w-d-xo.html

  • @Matthew-s5x7d
    @Matthew-s5x7d  1 month ago

    When will a local LLM work with Cline that is worth it? That's all that matters!!

  • @AK-ox3mv
    @AK-ox3mv  1 month ago

    The Qwen 2.5 Coder 32B Q4_K_M GGUF, which has 99% accuracy compared to the fp16 version, is just 20GB, and any graphics card with over 20GB of VRAM, like an Nvidia 3090 from 2020, can run it at about 20-30 tokens per second, which is on par with online models.

  • @BeastModeDR614
    @BeastModeDR614  1 month ago

    Does Ollama have it?

    • @johndaily9869
      @johndaily9869  1 month ago +1

      Yes, as of 51 minutes ago.

    • @alals6794
      @alals6794  1 month ago +3

      Don't get the quantized version. I think Ollama only offers quantized versions, but for coding you want max precision/accuracy, aka non-quantized. Hugging Face has them without quantization, I think, and might even offer it via a free API if you register on their site. OK, you need basic Python to use it via the API, true.
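For the "basic Python" part, a hosted-model call is just an authenticated HTTP POST. Here is a minimal sketch that only builds the request; the endpoint layout follows the Hugging Face Inference API convention, but treat the exact URL scheme and payload fields as assumptions to verify against the current docs (the model repo id is also assumed):

```python
import json

# Build a request for the (assumed) Hugging Face Inference API layout.
API_BASE = "https://api-inference.huggingface.co/models"

def build_request(model_id: str, prompt: str, token: str) -> dict:
    """Return the URL, auth header, and JSON body for a generation call."""
    return {
        "url": f"{API_BASE}/{model_id}",
        "headers": {"Authorization": f"Bearer {token}"},
        "payload": json.dumps({"inputs": prompt,
                               "parameters": {"max_new_tokens": 256}}),
    }

req = build_request("Qwen/Qwen2.5-Coder-32B-Instruct",
                    "Write a Python quicksort.", token="hf_...")
# Send with e.g. requests.post(req["url"], headers=req["headers"],
# data=req["payload"]) -- the network call is omitted here.
print(req["url"])
```

Keeping the request construction separate from the send makes it easy to swap in any OpenAI-compatible or self-hosted endpoint later.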

    • @latlov
      @latlov  1 month ago

      2:15

  • @chucky_genz
    @chucky_genz  1 month ago

    Ollama is still the king 😊

  • @francoisjunior.
    @francoisjunior.  1 month ago

    Mine runs slowly after installing. What does that mean?

  • @Matelight_IT
    @Matelight_IT  1 month ago

    32B on 24GB of VRAM works like magic, because the default quantization weighs only 20GB, so you can also add something like ~30 minutes of video subtitles (~8k tokens).

  • @null-db6or
    @null-db6or  1 month ago

    You're using such an old LM Studio version.

  • @gog2462
    @gog2462  1 month ago +1

    It is not working with 99% of coding apps, because it has no tool instructions to do stuff and no prompts for code checking, etc. It also basically doesn't know much about diffs and the normal stuff that is needed when you try to code with it.

    • @gog2462
      @gog2462  1 month ago

      Without retraining for coding with coding apps like VSC, it is useless :) Maybe if you have a "developer" who doesn't know anything about coding, then it can be used for a snake game and nothing more.

    • @latlov
      @latlov  1 month ago

      How about WordPress plugin development?

  • @yzw8473
    @yzw8473  1 month ago

    Task #1 is way too easy. Even qwen2.5-coder-0.5B-instruct can handle it correctly.

  • @cstephen4695
    @cstephen4695  1 month ago

    I was able to make it draw a butterfly SVG, but it still looks ugly. 🤣🤣

  • @Windswept7
    @Windswept7  1 month ago

    This is concerning; a warning should be given regarding the ties to the Chinese Communist Party.

  • @neoreign
    @neoreign  1 month ago

    Whoa whoa whoa, dude, go slow, man, lol. Not all of us are coders. You speak so fast!