Unlock the Power of AI with Ollama and Hugging Face

  • Published Nov 25, 2024

Comments • 84

  • @beachfeet6055 · a month ago · +25

    Matt, for each GGUF model listed on Hugging Face there is a black "Use this model" button. This opens a drop-down of providers, and Ollama is listed. Clicking that gives the whole "ollama run" command with the URL for the model metadata. Also, on the right side of each page are links for the various quant sizes, and each of these has the "Use this model" button too. Pretty handy! (A sketch of the resulting command follows this thread.)

    • @technovangelist · a month ago · +9

      Nice, another new thing. For a long time it felt like Ollama was intentionally left out of that list. Thanks for pointing this out.

    • @jimholmes692 · a month ago

      Not all models have the button, though.
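
  For reference, a sketch of the command that button generates. The general form is documented by Hugging Face and Ollama; the quant tag is optional:

      # run any public GGUF repo on Hugging Face directly
      ollama run hf.co/{username}/{repository}

      # optionally pin a specific quantization as a tag
      ollama run hf.co/{username}/{repository}:Q4_K_M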

  • @RasmusRasmussen · a month ago · +1

    Fantastic news! Of course, I immediately checked it in Open WebUI and had no problem loading one of my experimental Hugging Face models from the web interface. Very cool.

  • @aristotelesfernando · 11 days ago

    Thanks Matt! Another very interesting video.

  • @digital-economy · 24 days ago · +1

    Thanks for recommending the Ollama Chrome extension; it makes life easier. Maybe you could explain how to find great models on Hugging Face. I just downloaded the famous classic models and have no idea how to benefit from this huge database of AI stuff. When I found your video, I first thought it would answer how to find the right models on HF.

  • @JJJJ-r3u · a month ago · +21

    Whenever there is a command, I would hope to see a terminal with the command on the screen. It is easier to remember a command if you can see it rather than just hear it.

    • @thenaman047 · a month ago · +1

      @JJJJ-r3u Agree.

    • @SubhojeetNeogy · a month ago · +1

      @JJJJ-r3u True.

    • @technovangelist · a month ago · +1

      Great. That's why I showed it the first few times.

    • @thenaman047 · a month ago · +3

      @technovangelist Yes, but keep cutting to it whenever you are talking about commands, so we don't need to keep every word you say in mind. It helps a lot in understanding the commands, and whoever is trying to follow along can watch and do it in parallel. That's why a lot of other tech and programming YouTubers do the same, with their webcam off to the side.
      It's all about the viewer's perspective.
      Note: all of the above is positive feedback. Keep making the good stuff :)

  • @BORCHLEO · a month ago

    Thank you Matt! This is such an amazing way for new people to get into models with Ollama! Thank you for always making the best Ollama content ever! Have a good one!

  • @Kk-ed1gr · a month ago · +1

    Thanks for sharing this breakthrough. Super helpful.

  • @Chris-Nienart · a month ago

    Thank you for pointing out the caveats of the setup. I appreciate the time savings and not having to learn some of these lessons the hard way.
    Also, love the PSAs to stay hydrated. Reminds me of Bob Barker telling everyone to spay and neuter their pets.

  • @jidun9478 · a month ago

    This is a great start! That is the single biggest issue I have with Ollama: it should not be so complicated to add a custom model in GGUF format.

  • @leluch1616 · a month ago · +1

    Matt, thank you for your videos and clear explanations! Greetings from Ecuador! I was able to build so much stuff thanks to you!

    • @technovangelist · a month ago · +2

      Ecuador. One of many places I would love to see. My only stops in South America have been in Venezuela, Argentina, and Uruguay.

  • @atom6_ · a month ago · +4

    If only Ollama would add support for an MLX backend, text generation performance would double on Macs, though it is already quite good at the moment.

    • @electroheadfx · a month ago

      Oh OK, so it needs MLX backend support in the Ollama core?

    • @technovangelist · a month ago

      2x? No. It is much faster than LM Studio could do before; by adding MLX support they were able to catch up and go a touch faster, but then you have to deal with that disaster of a UI. It's questionable whether adding that backend would make much difference, and it would be a lot of work.

  • @newjoker5123 · a month ago

    I learned something new again, so it's another great video. Ty.

  • @Igbon5 · a month ago

    Learning more, thanks. I like motorcycle repair and maintenance too.

  • @wardehaj · a month ago

    Thanks for this great video!

  • @AliAlias · a month ago

    Nice feature, I love Ollama ❤

  • @tomwawer5714 · a month ago · +1

    Now I'm waiting for text2image in Ollama.

  • @dr_harrington · a month ago · +5

    It would be great if Ollama had Llama 3.2 11B available. Can you ask your friends for an update on their progress?

    • @technovangelist · a month ago · +5

      They are still working on it. There is a reason no other runners have it either.

    • @jossejosse952 · a month ago

      And the model in GGUF, if it's not too much trouble? Thanks in advance.

  • @chizzlemo3094 · a month ago

    Great videos, thank you very much.

  • @vickytube86 · a month ago

    Please create a video on changing the context length in Ollama... by default it is only 2K.
    Covering other parameter settings would be great too.

    • @technovangelist · a month ago

      There are a bunch of videos on the channel that show that (quick sketch below).
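
  For anyone searching, a minimal sketch of the two usual ways to raise the context length; the model name and the 8192 value are just examples:

      # one-off, inside the interactive session
      ollama run llama3.2
      >>> /set parameter num_ctx 8192

      # persistent, via a Modelfile
      cat > Modelfile <<'EOF'
      FROM llama3.2
      PARAMETER num_ctx 8192
      EOF
      ollama create llama3.2-8k -f Modelfile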

  • @ywueeee · a month ago

    Can you make a video on how to train on your own tweets and then generate a bunch of tweets in your style after giving it some new context?

  • @PerfectlyNormalBeast · a month ago · +1

    I think they're videos about Ollama, but they might just be singing for my cat.

  • @buildyear86 · a month ago

    Hi Matt! Thank you for your amazing educational content on AI - it's been a huge help. I'm building an AI agent in n8n on Linux, and I'm curious about the practical differences between using Nvidia GPUs and AMD GPUs with a large language model like Llama. I've heard Nvidia is superior, but what does this really mean in practice? Say we compare an Nvidia 4080 to an AMD 7900 XT, for example? Your insights would be incredibly valuable, and I'd be grateful if you could share your thoughts on this.

    • @buildyear86 · a month ago

      Asking because I would like to support AMD for its open-source approach versus Nvidia :)

    • @technovangelist · a month ago · +1

      High-end Nvidia is better than the best from AMD, but AMD is always cheaper for comparable performance.

    • @buildyear86 · a month ago

      Thank you. Always interested in a video on stuff like this! Cheers.

  • @NLPprompter · a month ago

    I hope there will be a feature to support a streaming speech model like Kyutai's Moshi (they haven't released anything yet...). It would be really cool to have an open-source local model that can do overlapping conversation, just like OpenAI's advanced voice mode does.

  • @QorQar · a month ago

    Thank you, and a question: what if the model has several parts? Does it support that?

  • @jinchoung · a month ago

    Thanks, Ollama...

  • @Pure_Science_and_Technology · a month ago · +1

    Does Ollama have a GUI? Lol, later in the video you answered my question. 😊

    • @technovangelist · a month ago · +3

      Ollama is text-based. There are many GUIs that run on top, but few are as good as the text interface.

  • @miloldr · a month ago

    What is your opinion on Nemotron 70B?

  • @icpart · a month ago

    Which front-end UI for Ollama is that in the video?

  • @volt5 · a month ago

    Hi Matt, I've been trying to understand system prompts. I understand these to essentially be prepended to every user prompt. In this video it seems that some models are trained with particular system prompts. Can you suggest a good site/document to read up on this?

    • @technovangelist · a month ago · +1

      They aren't necessarily trained with system prompts, and they aren't prepended to every user prompt. If you are having a conversation with the model, every previous question and answer is added to a messages block. At the top of that is the system prompt, and then all of that is handed to the model. Otherwise the model has no memory of any conversation. (Sketch after this thread.)

    • @volt5 · a month ago

      @technovangelist I wrote a simple client using the REST chat API. The results are absolutely cool. Very nice API. Your videos are very helpful.
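
  A minimal sketch of that messages block, using Ollama's REST chat endpoint (default port; model name and content illustrative). Note that the client resends the whole history on every turn; the model itself is stateless:

      curl http://localhost:11434/api/chat -d '{
        "model": "llama3.2",
        "messages": [
          {"role": "system", "content": "You are a terse assistant."},
          {"role": "user", "content": "Why is the sky blue?"},
          {"role": "assistant", "content": "Rayleigh scattering."},
          {"role": "user", "content": "Explain that in one sentence."}
        ],
        "stream": false
      }'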

  • @desireco · a month ago · +1

    If you import Hugging Face models into Ollama, they are usually beyond slow for some reason. I think the nature of the import just makes them use excessive resources, not the model size. So however interesting the model is, it's just a hassle and not worth it. But let me give it a whirl just to make sure; maybe they fixed it.

    • @technovangelist · a month ago · +1

      Not usually. They perform just as well if you get them from HF as if you get them from Ollama.

    • @desireco · a month ago

      @technovangelist I am downloading one and will try it. I might have been unlucky with weird models in the past, who knows.
      Thanks for covering this; this is really useful, and I prefer Ollama just because I am used to it.

    • @desireco · a month ago · +1

      Just to confirm that everything works well. I don't know why converting models in the past made them slow; it's definitely no longer an issue. Thanks again for a great video.

  • @ashutoshanand7944 · 22 days ago

    Hi Matt, thank you so much for such great videos. Is there any way I can use a non-GGUF Hugging Face model in Ollama? I want to use the facebook/mbart model for my translation work, but unfortunately I can't find a GGUF version of it. Additionally, could you please suggest the best model for translation work with the highest accuracy that I can use in Ollama?

    • @technovangelist · 21 days ago · +1

      I think mbart is a different architecture. But many PyTorch and other models can be converted; review the import section of the Ollama docs. (Rough sketch after this thread.)

    • @ashutoshanand7944 · 21 days ago

      @technovangelist thank you
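
  A rough sketch of the conversion path the import docs describe, using llama.cpp's converter. The script name matches recent llama.cpp; it only handles architectures it knows about, which may well exclude mbart:

      # convert a downloaded Hugging Face checkpoint to GGUF
      python llama.cpp/convert_hf_to_gguf.py ./my-hf-model --outfile my-model.gguf

      # wrap the result for Ollama
      cat > Modelfile <<'EOF'
      FROM ./my-model.gguf
      EOF
      ollama create my-model -f Modelfile
      ollama run my-model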

  • @mr.gk5 · 29 days ago

    Hi, do you have a video that elaborates on adding the Ollama chat template to Hugging Face models? I'm just one step away from getting it running -.-

    • @technovangelist · 29 days ago

      I have a few from a few months back that talk about creating the model files. Not much has changed there. The new feature in this video is that a 5-minute process is now a 30-second process. It's a convenience. (Template sketch after this thread.)

    • @mr.gk5 · 16 days ago

      @technovangelist Some GGUF LLMs are split into parts. How does it work if I want to create the model file? Am I supposed to merge them first, or will it be detected automatically?
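
  For the template step, a minimal sketch of a Modelfile TEMPLATE block. This one follows a generic ChatML-style layout; the correct template depends on how the particular model was trained, so check the model card:

      FROM ./my-model.gguf
      TEMPLATE """{{ if .System }}<|im_start|>system
      {{ .System }}<|im_end|>
      {{ end }}<|im_start|>user
      {{ .Prompt }}<|im_end|>
      <|im_start|>assistant
      """
      PARAMETER stop "<|im_end|>"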

  • @QorQar · a month ago

    Are safetensors models supported?

    • @learndatawithmark · a month ago

      You can't do safetensors directly like in this video. Ollama does support some of those models, but you have to use the Modelfile approach (sketch below). I made a short video showing how to do it with one of the HF models - th-cam.com/video/DSLwboFJJK4/w-d-xo.html
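
  A sketch of that Modelfile approach for a supported safetensors model; the directory path is hypothetical, and only certain architectures (e.g. Llama-family) can be imported this way:

      # Modelfile pointing at a directory of safetensors weights
      cat > Modelfile <<'EOF'
      FROM /path/to/safetensors-model-directory
      EOF
      ollama create my-safetensors-model -f Modelfile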

  • @miloldr · 22 days ago

    Do you know when Llama 4 will be released?

    • @technovangelist · 21 days ago

      Nope. Early next year? Late next year?

    • @miloldr · 21 days ago

      @technovangelist I can't wait that long :(

  • @mal-avcisi9783 · a month ago

    How do I download a different version of a GGUF model? Often there are various quantizations, like in QuantFactory/Ministral-3b-instruct-GGUF. How do I download the particular version I want?

    • @technovangelist · a month ago · +1

      Add the standard quant label as a tag (example below).
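
  So for the repo mentioned above, a sketch; the quant tag must match one of the files actually listed in the repo:

      # without a tag, a default quant is pulled (often Q4_K_M)
      ollama pull hf.co/QuantFactory/Ministral-3b-instruct-GGUF

      # pin a specific quantization with a tag
      ollama pull hf.co/QuantFactory/Ministral-3b-instruct-GGUF:Q8_0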

  • @electroheadfx · a month ago

    Is it possible to run MLX models on Apple Silicon for faster GPU performance, like LM Studio knows how to do?

    • @technovangelist · a month ago · +1

      Not yet. LM Studio added it recently, which has allowed them to catch up to Ollama and go past it by a couple percent at most. I tried it last night and, based on their claims, expected mind-blowing performance, but it's a tiny improvement over Ollama. Try it.

    • @electroheadfx · a month ago

      @technovangelist Thanks for the exchange and your videos, great work.

  • @ghazanfarabidi4137 · a month ago

    Gotta pivot to Otiger

  • @vertigoz · a month ago

    Does it use the GPU? I downloaded Ministral 8B and it seemed quite slow.

    • @technovangelist · a month ago

      If you have a recent GPU, Ollama will support it (check with the sketch below).
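
  A quick way to verify, assuming a model is currently loaded; `ollama ps` reports whether it is running on the GPU, the CPU, or split between them (output illustrative):

      ollama ps
      # NAME          ID          SIZE     PROCESSOR    UNTIL
      # my-model:8b   a1b2c3...   5.9 GB   100% GPU     4 minutes from now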

  • @kumaraswamypallukuri3570 · a month ago

    Nice video. Can you download two models and run them together in Ollama?

    • @learndatawithmark · a month ago

      Yes, you can download as many models as you can fit on your machine. Ollama lets you load multiple of them in memory and run them in parallel too (sketch below).
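
  A sketch of the relevant server settings; OLLAMA_MAX_LOADED_MODELS and OLLAMA_NUM_PARALLEL are real Ollama environment variables, and the values and model names are illustrative:

      # allow two models resident in memory at once, four parallel requests each
      OLLAMA_MAX_LOADED_MODELS=2 OLLAMA_NUM_PARALLEL=4 ollama serve

      # then, from two other terminals
      ollama run llama3.2
      ollama run qwen2.5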

  • @ОлегКонстантинович-г2ж · a month ago

    What is the name of the GUI?

    • @technovangelist · a month ago

      I mentioned it: Page Assist, a Chrome extension.

  • @envoy9b9 · a month ago

    Can I download an MLX model and run it in Ollama on Apple Silicon?

    • @technovangelist · a month ago

      You would use MLX for an MLX model.

  • @fabriai · a month ago

    Are you a tiger whisperer?

  • @jsward17 · a month ago

    Did you get kicked off the team?

    • @technovangelist · a month ago

      It's been answered a few times elsewhere on the channel. But there are lots of reasons folks don't stay at companies forever, and Ollama is just another company like any other.

  • @mal-avcisi9783 · a month ago

    I am so sick of the word "model". I hear model, model, model... My brain gets triggered by this word.