Qwen 2.5 Coder 32B Run Locally Using Ollama And OpenWebUI - Best Open Source Coder LLM?

  • Published Jan 20, 2025

Comments • 23

  • @TheFutureThinker  2 months ago

    The Qwen 2.5 Coder series is now available in 6 sizes: 0.5B, 1.5B, 3B, 7B, 14B and 32B.
    Resources:
    github.com/QwenLM/Qwen2.5-Coder?tab=readme-ov-file
    huggingface.co/Qwen/Qwen2.5-Coder-32B-Instruct
    ollama.com/library/qwen2.5-coder:32b
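
    A minimal sketch for trying the model from Python, assuming Ollama is running locally on its default port (11434) and the model has already been pulled with "ollama pull qwen2.5-coder:32b":

        import requests

        # Ask the locally served model for a completion via Ollama's REST API.
        resp = requests.post(
            "http://localhost:11434/api/generate",
            json={
                "model": "qwen2.5-coder:32b",
                "prompt": "Write a Python function that reverses a linked list.",
                "stream": False,  # return one JSON object instead of a token stream
            },
            timeout=600,  # a 32B model can take a while on consumer hardware
        )
        print(resp.json()["response"])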

  • @duncanh3721  2 months ago  +3

    Great video, always love the LLM stuff. I gotta try this model out. Thank you!

    • @TheFutureThinker  2 months ago

      Nice! Finally we have an LLM video here. I was starting to wonder if this channel only had AI image and video people 😅😅

  • @HaraldEngels  2 months ago

    Great video. I especially like the code analysis, since I fall into the group of developers who like to delegate snippet coding to AI and then adapt the response for existing projects.

  • @asddsa-di7wi  2 months ago

    Hello. How can I activate the interface on the right? I don't see anything like that.

    • @TheFutureThinker  2 months ago

      It appears automatically when you chat about coding or image generation.

    • @samg.studio  2 months ago

      Yeah, I have the same question. I can't get that to appear with the latest version of Open WebUI and Qwen 14B via Ollama.

  • @6tokyt468  2 months ago

    Hello, I followed the tutorial, but for some reason the responses are very slow, taking several minutes each, even though I have a very high-end PC (i9-14400K, 128 GB DDR5, RTX 4090 24 GB, 4 TB NVMe).
    Any idea why it takes so long on my PC?

    • @TheFutureThinker  2 months ago

      How slow is it?

    • @6tokyt468  2 months ago

      @TheFutureThinker It takes about one to two minutes for a really basic question/answer. According to ChatGPT, my computer should run it faster than the cloud does.
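
      One likely cause (an assumption, not confirmed in this thread): if the 32B weights don't fully fit in the 4090's 24 GB of VRAM at the chosen quantization, Ollama splits the model between GPU and CPU, which slows generation dramatically. A quick check of how much of a loaded model actually sits in VRAM, assuming a recent Ollama version that exposes the /api/ps endpoint on its default port:

          import requests

          # List the models the local Ollama server currently has loaded and
          # report what fraction of each one's weights is resident in GPU memory.
          models = requests.get("http://localhost:11434/api/ps", timeout=10).json()["models"]
          for m in models:
              gpu_fraction = m["size_vram"] / m["size"] if m["size"] else 0.0
              print(f"{m['name']}: {gpu_fraction:.0%} of weights in VRAM")

      Anything well below 100% means part of the model is running on the CPU; a smaller or more heavily quantized variant should respond much faster.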

  • @valentink1251  2 months ago

    Which of the available web search engines are you using?

  • @ss-tb2ev  2 months ago

    My PC config is a 4070 Ti Super with 16 GB VRAM and an i7-14700K with 64 GB RAM.
    Which one should I run, 7B or 32B?

    • @GundamExia88  2 months ago

      They have 14B too, and I am actually running 14B right now; 32B is a bit slow on 2x 1070. 14B is not as good as 32B, but good enough for my needs.

    • @saadatzi  2 months ago  +4

      Computer resources:
      * RAM (memory): essential for holding the model during inference/training.
      * GPU (graphics processing unit): significantly accelerates matrix operations, crucial for deep learning.
      * CPU (central processing unit): handles non-matrix operations and data preprocessing.
      * Storage: for model files, datasets, and any generated data.
      Correspondence guidelines (approximate):
      * Small (100M - 500M params): 2-8 GB RAM, 2-4 GB VRAM; entry-level PC (e.g., Intel Core i3, NVIDIA GeForce GTX 1650)
      * Medium (1B - 5B params): 16-32 GB RAM, 8-16 GB VRAM; mid-range PC (e.g., Intel Core i7, NVIDIA GeForce RTX 2060)
      * Large (10B - 20B params): 64-128 GB RAM, 24-32 GB VRAM; high-end PC (e.g., Intel Core i9, NVIDIA GeForce RTX 3080)
      * Extra large (50B+ params): 256 GB+ RAM, 48 GB+ VRAM; specialized workstation/server (e.g., multi-GPU setup, high-end CPUs)

    • @saadatzi  2 months ago  +5

      Resource requirements by model size:
      Small (1-3B parameters):
      - RAM: 4-8 GB
      - VRAM: 4-6 GB
      - Examples: Phi-2, TinyLlama
      Medium (7-13B parameters):
      - RAM: 16-24 GB
      - VRAM: 8-16 GB
      - Example: Llama 2 13B
      Large (30-70B parameters):
      - RAM: 32-64 GB
      - VRAM: 24 GB+
      - Example: Llama 2 70B
      Key considerations:
      - RAM: minimum RAM ≈ 2 GB per billion parameters at FP16 (e.g., a 7B model needs ~14 GB), plus additional overhead for operations.
      - GPU VRAM: 4-bit quantization cuts requirements roughly fourfold versus FP16 (e.g., a 13B model → ~6 GB VRAM at 4-bit); full precision needs considerably more. A rough sketch of this arithmetic follows below.
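
      A minimal sketch of that arithmetic in Python, assuming the weights dominate memory use and taking a ~20% overhead factor for the KV cache and runtime buffers (the overhead figure is an assumption, not from the thread):

          # Rough estimate: parameters x bytes-per-parameter, plus runtime overhead.
          def estimate_memory_gb(params_billions: float, bits_per_param: int = 16,
                                 overhead: float = 1.2) -> float:
              bytes_per_param = bits_per_param / 8
              return params_billions * bytes_per_param * overhead

          print(f"7B  @ FP16 : {estimate_memory_gb(7, 16):.1f} GB")   # 16.8 GB (14 GB weights + overhead)
          print(f"13B @ 4-bit: {estimate_memory_gb(13, 4):.1f} GB")   # 7.8 GB (6.5 GB weights + overhead)
          print(f"32B @ 4-bit: {estimate_memory_gb(32, 4):.1f} GB")   # 19.2 GB, why a 24 GB card can host 32B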

    • @GundamExia88  2 months ago  +2

      @saadatzi Thanks! This is very helpful. I've been trying to find this info!

  • @cerilza_kiyowo  2 months ago

    How can your Open WebUI fetch URLs?

    • @TheFutureThinker  2 months ago

      In Admin Panel > Settings > Web Search, click Enable.

  • @MyaJames-e6z  2 months ago

    I use Shadow PC to run this stuff.