Hugging Face GGUF Models locally with Ollama

  • Published 24 Jan 2025

Comments • 54

  • @TJTHEFOOTBALLPROPHET 8 months ago +5

    You had me 20 seconds into this video - INSTANT FOLLOW! 🔥 MUCH LOVE FROM NEW ORLEANS AND THANK YOU❤

  • @Seedlinux 6 months ago +1

    Thanks a lot for this! I was looking for a video that explained this exact topic and you did it in such a simple and efficient way. Kudos!❤

  • @davidtindell950 6 months ago +2

    Thank You. Very Useful and Very Timely!

  • @AlperYilmaz1 1 year ago +4

    I'm loving ollama.. it was a breeze to run a 7B model locally in my humble laptop.. If you can show us how to fine-tune a model locally and then use it with ollama, that would be awesome..

    • @learndatawithmark 1 year ago +2

      That's on my list of things to figure out! Do you have any particular thing you'd like to fine tune for?

    • @AlperYilmaz1 1 year ago +2

    @@learndatawithmark I have some data where the input is bullet points of facts and the output is a coherent paragraph written from those bullet points. There are tons of tutorials, but it's too chaotic: each one uses a different base model, quantization, prompt template, and output format. So it would be great to create a model which can be run with Ollama. If you're interested, I can share the data.

    • @namannavneet587 10 months ago

      @@AlperYilmaz1 Hi sir, I am having the same problem. Have you found a way to do it?

  • @kelvinli2970 8 months ago +5

    How do you do it on Windows?

  • @elias9298 1 year ago

    Thank you very much! After reading the documentation and spending 30 minutes asking GPT-4 (like an idiot) how to do it, I was confused. It looks like I did something wrong when writing the path. Your video is clear and easy to understand.

    • @learndatawithmark 11 months ago

      Glad it helped :)

    • @boysrcute 11 months ago

      And what was your fix? It's really bothering me.

    • @NyxesRealms 8 months ago

      @@boysrcute To ensure the paths are correctly formatted and avoid parsing errors in scripts or tools that interpret backslashes as escape characters, you should use double backslashes (\\) or replace them with forward slashes (/). You'll want to fix your Modelfile path to use forward slashes.
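To illustrate the fix described above, here's a minimal sketch that normalises a Windows-style path to forward slashes; the path itself is a made-up example:

```shell
# Convert a Windows-style path (hypothetical example) to forward slashes so
# tools that treat backslashes as escape characters parse it correctly.
win_path='C:\Users\me\models\mistral.gguf'
posix_path=$(printf '%s' "$win_path" | tr '\\' '/')
echo "$posix_path"   # prints C:/Users/me/models/mistral.gguf
```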

  • @faqs-answered 8 months ago +1

    Thanks for the video!

  • @fra4897 11 months ago +1

    what about the prompt template in the model file?

    • @learndatawithmark 11 months ago

      I didn't bother doing anything with that, but there are a lot more options for refining that than there were at the time I created the video. You can see all the options here - github.com/ollama/ollama/blob/main/docs/modelfile.md
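For reference, a minimal Modelfile that does set a template might look like this; the GGUF filename and the Mistral-style [INST] template are illustrative placeholders, not taken from the video:

```
FROM ./mistral-7b-instruct.Q4_K_M.gguf

TEMPLATE """[INST] {{ .Prompt }} [/INST]"""

PARAMETER temperature 0.7
```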

  • @GAllium14 1 year ago +3

    Great vid bro, you're very underrated

  • @luisEnrique-lj4fq 5 months ago

    Thanks a lot, Ollama!!

  • @ConsultingjoeOnline 11 months ago

    Great video, Thank you!

  • @bigglyguy8429 6 months ago +2

    Pro tip - use pretty much anything else EXCEPT Ollama, because Ollama demands you only use their version of GGUF files. Every other such software uses normal GGUF you can download yourself. Don't get trapped inside Ollama.

    • @EcoTekTermika 5 months ago

      You know you can use any model from HF in Ollama right?

    • @bigglyguy8429 5 months ago

      @@EcoTekTermika Only if you mess around creating Modelfiles for it and don't mind it being given a SHA hash as a name... which is my point: it's just an HF GGUF, but with extra steps that stop you from using the models for anything else.

    • @learndatawithmark 5 months ago

      Yeah, it's frustrating - it would be way better if you could use the HF models directly and they kept a separate file for any of the metadata they create.
      The main reason I end up using Ollama a lot of the time is that I can't find a reliable place to get quantised versions of the models. I used to download them from TheBloke, but he stopped doing them around January!

  • @emil8367 11 months ago

    Thank you for sharing 🙂
    Is it possible to use Ollama's SHA256:... blob files like GGUF / bin files, e.g. in LM Studio / Autogen, or are these files useless outside Ollama because of the hashing applied to them (if any)?

    • @learndatawithmark 11 months ago +1

      That is a good question and I don't know the answer right now. Need to take a look at the Ollama code to see exactly what those files contain!
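If you want to check for yourself, the blobs can be inspected by their magic bytes. A sketch, assuming the default ~/.ollama/models/blobs layout; per the GGUF spec, a raw GGUF file starts with the 4-byte magic "GGUF", so a blob that starts with it should be usable as a plain .gguf elsewhere:

```shell
# Returns success if the file starts with the GGUF magic bytes.
is_gguf() { [ "$(head -c 4 "$1" 2>/dev/null)" = "GGUF" ]; }

# Scan Ollama's content-addressed blob store (path is the default; may differ).
for f in "$HOME"/.ollama/models/blobs/sha256-*; do
  [ -e "$f" ] || continue            # no Ollama models on this machine
  if is_gguf "$f"; then
    echo "GGUF blob: $f"             # candidate to copy/rename to model.gguf
  fi
done
```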

  • @stanTrX 9 months ago

    Thanks. What about LFS models? (Many models don't have GGUF model files.) Am I skipping something here?

    • @learndatawithmark 9 months ago +1

      I'm not sure what an LFS model is? Also, you don't have to use Ollama to run models - there's always Hugging Face's transformers library, which works with all their models too.

  • @alsoeris 7 months ago

    Is there no way to run a .gguf file that I already have downloaded? If not, I guess I'll have to stick with LM Studio & TGwebui.

    • @learndatawithmark 7 months ago +1

      Nope, AFAIK you can't run a GGUF directly; you always have to convert it to Ollama's format. Other tools that run GGUF files directly are llama.cpp and llamafile, in case you haven't heard of them!
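The conversion described in this reply can be sketched as follows; a minimal example assuming `ollama` is installed and `mistral-7b.Q4_K_M.gguf` is a GGUF you already downloaded (both names are placeholders):

```shell
# 1. Point a Modelfile at the local GGUF file.
cat > Modelfile <<'EOF'
FROM ./mistral-7b.Q4_K_M.gguf
EOF

# 2. Register it under a name of your choosing, then run it
#    (guarded so the sketch is a no-op where ollama isn't installed).
if command -v ollama >/dev/null 2>&1; then
  ollama create mistral-local -f Modelfile
  ollama run mistral-local "Say hello"
fi
```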

  • @andremota247 5 months ago

    Wouldn't it be nice to make a model that fuses them all into one master AI?

  • @erdagkucukdemirci 9 months ago

    What is the asitop alternative for Linux?

    • @learndatawithmark 9 months ago

      asitop describes itself as the Apple Silicon take on nvtop, so perhaps nvtop itself would do the job on Linux?
      "A Python-based nvtop-inspired command line tool for Apple Silicon (aka M1) Macs."

  • @DihelsonMendonca 6 months ago

    There must be a new and easy method currently to run GGUF models in Ollama or in Open WebUI. Please update this method. 🎉❤

    • @learndatawithmark 6 months ago +1

      As far as I know, this is still the way to run GGUF models with Ollama. I wish you could use GGUF files directly, it would be so much easier!
      I haven't used Open WebUI, I'll take a look at that.
      If you want command line tools that can run GGUF files directly, take a look at llamafile or llama.cpp
      github.com/ggerganov/llama.cpp
      github.com/Mozilla-Ocho/llamafile

    • @DihelsonMendonca 6 months ago

      @@learndatawithmark Thanks for answering. Indeed, my interest is in running them on Ollama, because of the new Open WebUI, which is the most marvelous thing invented. Open WebUI is a frontend for Ollama that presents a ChatGPT-like interface with conversation history; you can talk to LLMs completely hands-free (you speak and it talks back), input texts and PDFs, do RAG, let LLMs access the internet in real time, upload images - it's multimodal, it's fantastic. You definitely need to test it. The problem is that it's based on Ollama. I use LM Studio with dozens of Hugging Face models, and I love them. I'd like to use those models in Open WebUI, but they're in GGUF format - that's why I found your video, in order to use GGUF models in Ollama. 🙏👍💥

  • @DarkTrapStudio 1 year ago

    I don't understand. I've downloaded Ollama and run the first model with: ollama run dolphin-mixtral:latest
    But it was too slow. I don't understand all the parts you went through - you used Hugging Face, but I just want to install and run the model. I don't know anything about that Poetry / Hugging Face part.

    • @learndatawithmark 1 year ago

      If you only want to use one of the models from Ollama's built-in library, you don't need to do any of the stuff in this video - you can do what you said. But keep in mind that dolphin-mixtral is one of the biggest models, so it will be slower than the others. Perhaps try dolphin-mistral to see if that gives better performance.

    • @DarkTrapStudio 1 year ago

      @@learndatawithmark I will try to figure out how to do all this, thanks.

  • @GigsTaggart 11 months ago

    Why all the complication instead of just using curl to get the GGUF from the website you were already on?

    • @learndatawithmark 11 months ago

      Good question! I have used cURL a few times, but in the instructions it suggested that you should use the CLI tool. I haven't actually looked at the code to see if/what it does differently to cURL.
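For anyone weighing the two approaches, here's a sketch of both download routes; the repo and filename below are illustrative examples, not from the video:

```shell
# Repo and filename are placeholders for whatever GGUF you're after.
repo="TheBloke/Mistral-7B-Instruct-v0.2-GGUF"
file="mistral-7b-instruct-v0.2.Q4_K_M.gguf"

# 1. Plain cURL against Hugging Face's 'resolve' endpoint:
url="https://huggingface.co/${repo}/resolve/main/${file}"
echo "$url"
# curl -L -O "$url"    # uncomment to actually download

# 2. The official CLI from the huggingface_hub package:
# huggingface-cli download "$repo" "$file" --local-dir .
```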

  • @MichaelDomer 9 months ago

    A clear example that shows that most nerds lack the skills and the willingness to provide simple solutions for non-nerds, the majority of us humans.
    It takes less than a minute to install LM Studio and an AI model and have your code checked, a role played, questions answered, etc., in private. Why do guys like you so often ignore regular users and only focus on fellow nerds?

    • @learndatawithmark 9 months ago

      Hey - Thanks for your feedback. You're right - LM Studio is an easier approach, but I didn't know that it supported GGUF until I read your post.
      This video was also more about showing how you can use Ollama to run models even if the Ollama folks haven't already added it as a library.
      To be fair they do now seem to add new models so quickly that you rarely have that situation.

  • @starflyai 5 months ago

    no offense but you sound like jacksepticeye

    • @learndatawithmark 5 months ago

      I have no idea who that is! Should I be offended?!

  • @jalapenos12 7 months ago

    WTF, Poetry? Noooooo, why?

    • @learndatawithmark 7 months ago

      What should I use instead?!

    • @jalapenos12 7 months ago

      @@learndatawithmark Anaconda is my standard, but maybe I should learn poetry. I think I'm just frustrated that so many tutorials assume familiarity with so many tech stack options.

  • @sylvercloud7970 3 months ago

    When I try this I do get the newly added model listed, but when I run it, it fails with: Error: Post "127.0.0.1:11434/api/generate": EOF
    I have this problem with the safetensors-converted-to-GGUF model that I imported into Ollama. Other, much larger models like 34B LLaVA run fine. For the conversion to GGUF I used the ruSauron/to-gguf-bat method on GitHub. Any ideas where this went wrong?
    Thanks

    • @learndatawithmark 3 months ago +1

      Is there anything in the Ollama log file? ~/.ollama/logs/server.log
      It might also be worth seeing whether you can use the safetensors model directly - I showed how to do that here: th-cam.com/video/DSLwboFJJK4/w-d-xo.html
      Equally, it might just be that the model isn't supported by llama.cpp, which is the underlying library that Ollama uses to run inference on LLMs.
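For reference, the llama.cpp conversion route mentioned here looks roughly like this; script and flag names are as found in recent llama.cpp checkouts and have been renamed before, so treat them as assumptions to verify against your checkout:

```shell
# Sketch only: build the command for llama.cpp's HF-to-GGUF converter.
# model_dir and the output filename are placeholders.
model_dir="./my-safetensors-model"
cmd="python llama.cpp/convert_hf_to_gguf.py $model_dir --outfile my-model-f16.gguf"
echo "$cmd"   # run this from a llama.cpp checkout with its Python deps installed
# A further quantisation step (e.g. to Q4_K_M) uses llama.cpp's llama-quantize tool.
```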

    • @sylvercloud7970 3 months ago

      @@learndatawithmark Thanks for the response. I did see your other video on using safetensors in Ollama but ran into a roadblock right at the start 😊. I posted a message on that video too, with a question about where to find the “template” info, but for some reason the message doesn't appear. The template isn't mentioned on the model page and I'm at my wits' end trying to find it. I skipped the template info and, needless to say, it didn't work.

    • @sylvercloud7970 3 months ago

      I just retried posting a message on the other video - hopefully it registers this time. :))

    • @sylvercloud7970 3 months ago

      @@learndatawithmark I don't see a logs folder in .ollama. All that's in there is: history, id_ed25519, id_ed25519.pub
      I'm looking in the root folder. Is it elsewhere?
      Thanks

    • @sylvercloud7970 2 months ago

      I tried another conversion method using llama.cpp, and on the last step this is what I got:
      INFO:hf-to-gguf:Loading model: GOT-OCR2_0
      ERROR:hf-to-gguf:Model GOTQwenForCausalLM is not supported
      You were right - perhaps not every safetensors model can be converted to GGUF.