Unlock the Power of AI with Ollama and Hugging Face

  • Published Dec 26, 2024

Comments • 84

  • @beachfeet6055
    @beachfeet6055 2 months ago +27

    Matt, for each GGUF model listed on Hugging Face there is a black "Use This Model" button. It opens a drop-down of providers, and Ollama is listed. Clicking that gives the whole "ollama run" command with the URL for the model metadata. The right side of each page also has links for the various quant sizes, each of which has its own "Use This Model" button. Pretty handy!
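For reference, the "ollama run" command that button generates follows a simple pattern; a sketch (the repo name below is just an example, substitute any GGUF repo on Hugging Face):

```shell
# Pull and run a GGUF model directly from Hugging Face
ollama run hf.co/bartowski/Llama-3.2-1B-Instruct-GGUF
```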

    • @technovangelist
      @technovangelist  2 months ago +11

      Nice. Another new thing. For a long time it felt like Ollama was intentionally left out of that list. Thanks for pointing this out.

    • @jimholmes692
      @jimholmes692 2 months ago

      Not all models have the button, though.

  • @JJJJ-r3u
    @JJJJ-r3u 2 months ago +25

    Whenever there is a command, I would hope to see a terminal with the command on the screen. It is easier to remember if one can see it rather than just hear it.

    • @thenaman047
      @thenaman047 2 months ago +1

      @@JJJJ-r3u agree

    • @SubhojeetNeogy
      @SubhojeetNeogy 2 months ago +1

      @@JJJJ-r3u true

    • @technovangelist
      @technovangelist  2 months ago +1

      Great. That’s why I showed it the first few times

    • @thenaman047
      @thenaman047 2 months ago +5

      @@technovangelist Yes, but keep switching to it whenever you are talking about commands, so we don't have to keep every word you say in mind. It helps a lot in understanding the commands, and whoever is trying to follow along can see it and work in parallel. That's why a lot of other tech and programming YouTubers do the same, with their webcam off to the side.
      It's all about the viewer's perspective.
      Note: all of the above is positive feedback. Keep making the good stuff :)

  • @RasmusRasmussen
    @RasmusRasmussen 2 months ago +1

    Fantastic news! Of course, I immediately checked it on OpenWeb-UI and had no problem loading one of my experimental huggingface models from the web interface. Very cool.

  • @Kk-ed1gr
    @Kk-ed1gr 2 months ago +1

    Thanks for sharing this breakthrough. Super helpful.

  • @BORCHLEO
    @BORCHLEO 2 months ago

    Thank you Matt! This is such an amazing way for new people to get into models with Ollama! Thank you for always making the best Ollama content ever! Have a good one!

  • @leluch1616
    @leluch1616 2 months ago +1

    Matt, thank you for your videos and clear explanations! Greetings from Ecuador! I was able to build so much thanks to you!

    • @technovangelist
      @technovangelist  2 months ago +2

      Ecuador. One of many places I would love to see. My only stops in South America have been in Venezuela, Argentina, and Uruguay.

  • @aristotelesfernando
    @aristotelesfernando 1 month ago

    Thanks Matt! Another very interesting video.

  • @jidun9478
    @jidun9478 2 months ago

    This is a great start! That is the single biggest issue I have with Ollama: it should not be so complicated to add a custom model in GGUF format.

  • @Lieblingszuschauer
    @Lieblingszuschauer 1 month ago +1

    Thanks for recommending the Ollama Chrome extension. It makes life easier. Maybe you can explain how to find great models on Hugging Face. I just downloaded the famous classic models and have no idea how to benefit from this huge database of AI stuff. When I found your video, I first thought it would answer how to find the right models on HF.

  • @Chris-Nienart
    @Chris-Nienart 2 months ago

    Thank you for pointing out the caveats of the setup. I appreciate the time savings and not having to learn some of these lessons the hard way.
    Also, love the PSAs to stay hydrated. Reminds me of Bob Barker telling everyone to spay and neuter their pets.

  • @atom6_
    @atom6_ 2 months ago +4

    If only Ollama would add support for an MLX backend, text generation performance would go 2x on Macs, although it is already quite good at the moment.

    • @electroheadfx
      @electroheadfx 2 months ago

      Oh OK, so it needs MLX backend support in Ollama core?

    • @technovangelist
      @technovangelist  2 months ago

      2x? No. It is much faster than LM Studio could do before, and because of supporting that they were able to catch up and go a touch faster, but then you have to deal with that disaster of a UI. It's questionable whether adding that backend would make much difference, and it would be a lot of work.

  • @dr_harrington
    @dr_harrington 2 months ago +5

    Would be great if Ollama had llama 3.2 11B available. Can you ask your friends for an update on their progress?

    • @technovangelist
      @technovangelist  2 months ago +5

      They are still working on it. There is a reason no other runners have it either.

    • @jossejosse952
      @jossejosse952 2 months ago

      And the model in GGUF? If it's not too much trouble, thanks in advance.

  • @tomwawer5714
    @tomwawer5714 2 months ago +1

    Now I'm waiting for text-to-image in Ollama.

  • @newjoker5123
    @newjoker5123 2 months ago

    I learned something new again, so it's another great video. Ty.

  • @Igbon5
    @Igbon5 2 months ago

    Learning more thanks. I like motorcycle repair and maintenance too.

  • @wardehaj
    @wardehaj 2 months ago

    Thanks for this great video!

  • @icpart
    @icpart 2 months ago

    Which front-end UI for Ollama is that in the video?

  • @chizzlemo3094
    @chizzlemo3094 2 months ago

    Great videos, thank you very much

  • @AliAlias
    @AliAlias 2 months ago

    Nice feature, I love Ollama ❤

  • @miloldr
    @miloldr 2 months ago

    What is your opinion on Nemotron 70B?

  • @NLPprompter
    @NLPprompter 2 months ago

    I hope there will be a feature to support token-streaming models like Kyutai's Moshi (they haven't released any yet...), but it would be really cool to have an open-source local model that can do overlapping conversation, just like OpenAI's advanced voice mode does.

  • @QorQar
    @QorQar 2 months ago

    Thank you, and a question: if the model comes in several parts, is that supported?

  • @vickytube86
    @vickytube86 2 months ago

    Please create a video on changing the context length in Ollama... by default it is only 2K.
    Covering the other parameter settings would be great too.

    • @technovangelist
      @technovangelist  2 months ago

      There are a bunch of videos on here that show that.
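For anyone landing here, a minimal sketch of the usual approach, assuming the documented `num_ctx` parameter and an already-pulled base model (names here are examples):

```
FROM llama3.2
PARAMETER num_ctx 8192
```

Saving that as a `Modelfile` and running `ollama create llama3.2-8k -f Modelfile` gives a variant with the larger context; alternatively, inside an interactive `ollama run` session, `/set parameter num_ctx 8192` changes it for that session only.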

  • @buildyear86
    @buildyear86 2 months ago

    Hi Matt! Thank you for your amazing educational content on AI - it's been a huge help. I'm building an AI agent in n8n on Linux, and I'm curious about the practical differences between using NVIDIA GPUs and AMD GPUs with a large language model like Llama. I've heard NVIDIA is superior, but what does this really mean in practice? Say we compare an NVIDIA 4080 to an AMD 7900 XT, for example. Your insights would be incredibly valuable, and I'd be grateful if you could share your thoughts on this.

    • @buildyear86
      @buildyear86 2 months ago

      Asking because I would like to support AMD for its open-source approach versus NVIDIA :)

    • @technovangelist
      @technovangelist  2 months ago +1

      High-end NVIDIA is better than the best from AMD, but AMD is always cheaper for comparable performance.

    • @buildyear86
      @buildyear86 2 months ago

      Thank you. Always interested in a vid on stuff like this! Cheers.

  • @mr.gk5
    @mr.gk5 2 months ago

    Hi, do you have a video that elaborates on adding the Ollama chat template to Hugging Face models? I'm just one step away from getting it running -.-

    • @technovangelist
      @technovangelist  2 months ago

      I have a few from a few months back that talk about creating the model files. Not much has changed there. The new feature in this video is that a 5-minute process is now a 30-second process. It's a convenience.

    • @mr.gk5
      @mr.gk5 1 month ago

      @ Some GGUF LLMs are split into parts. How does it work if I want to create the model file? Am I supposed to merge them first, or will it be detected automatically?

  • @miloldr
    @miloldr 1 month ago

    Do you know when Llama 4 will be released?

    • @technovangelist
      @technovangelist  1 month ago

      Nope. Early next year? Late next year?

    • @miloldr
      @miloldr 1 month ago

      @technovangelist I can't wait that long :(

  • @volt5
    @volt5 2 months ago

    Hi Matt, I've been trying to understand system prompts. I understand them to essentially be prepended to every user prompt. In this video it seems that some models are trained with particular system prompts. Can you suggest a good site/document to read up on this?

    • @technovangelist
      @technovangelist  2 months ago +1

      They aren't necessarily trained with system prompts, and they aren't prepended to every user prompt. If you are having a conversation with the model, every previous question and answer is added to a messages block. At the top of that is the system prompt. All of that is then handed to the model; otherwise the model would have no memory of the conversation.
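The flow described above can be sketched in plain Python. This is illustrative only; the function and variable names are made up for the example, not an Ollama API:

```python
# Sketch of how a chat client assembles the context sent to the model each turn:
# the system prompt sits at the top, followed by the full conversation so far.

def build_messages(system_prompt, history, new_question):
    """Return the full messages block handed to the model for one turn."""
    messages = [{"role": "system", "content": system_prompt}]
    for question, answer in history:
        messages.append({"role": "user", "content": question})
        messages.append({"role": "assistant", "content": answer})
    messages.append({"role": "user", "content": new_question})
    return messages

# One prior exchange plus a new question:
history = [("What is Ollama?", "A tool for running LLMs locally.")]
msgs = build_messages("You are a concise assistant.", history, "Does it support GGUF?")
# The model sees the system prompt plus every prior turn; that is its only "memory".
```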

    • @volt5
      @volt5 2 months ago

      @@technovangelist I wrote a simple client using the REST chat API. The results are absolutely cool. Very nice API. Your videos are very helpful.

  • @ashutoshanand7944
    @ashutoshanand7944 1 month ago

    Hi Matt, thank you so much for such great videos. Is there any way I can use the non-GGUF Hugging Face model in Ollama? I want to use the facebook/mbart model for my translation work, but unfortunately, I can't find a GGUF version of it. Additionally, could you please suggest the best model for translation work with the highest accuracy that I can use in Ollama?

    • @technovangelist
      @technovangelist  1 month ago +1

      I think mbart is a different architecture. But many PyTorch and other models can be converted. Review the import docs in the Ollama docs.

    • @ashutoshanand7944
      @ashutoshanand7944 1 month ago

      @technovangelist thank you

  • @QorQar
    @QorQar 2 months ago

    Are safetensors models supported?

    • @learndatawithmark
      @learndatawithmark 2 months ago

      You can't use safetensors directly like in this video. Ollama does support some of those models, but you have to use the Modelfile approach. I made a short video showing how to do it with one of the HF models - th-cam.com/video/DSLwboFJJK4/w-d-xo.html
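The Modelfile approach amounts to pointing `FROM` at a local checkout; a minimal sketch, assuming a supported architecture and a hypothetical local path:

```
FROM /path/to/local/safetensors/model
```

Then something like `ollama create my-model -f Modelfile` followed by `ollama run my-model`, per the Ollama import docs.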

  • @vertigoz
    @vertigoz 2 months ago

    Does it use the GPU? I downloaded Ministral 8B and it seemed quite slow.

    • @technovangelist
      @technovangelist  2 months ago

      If you have a recent GPU, Ollama will support it.

  • @ywueeee
    @ywueeee 2 months ago

    Can you make a video on how to train on your own tweets and then generate a bunch of tweets in your style after giving it some new context?

  • @mal-avcisi9783
    @mal-avcisi9783 2 months ago

    How do I download a different version of a GGUF model? Often there are various quantizations, as in QuantFactory/Ministral-3b-instruct-GGUF. How do I download the particular version I want?

    • @technovangelist
      @technovangelist  2 months ago +1

      Add the standard quant label as a tag.
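In practice that means appending the quant label from the repo's file list as a tag; a sketch using the repo mentioned above (Q8_0 is just an example, use whichever quant the repo actually provides):

```shell
# Default pull uses the repo's default quantization
ollama run hf.co/QuantFactory/Ministral-3b-instruct-GGUF
# Request a specific quantization by adding its label as the tag
ollama run hf.co/QuantFactory/Ministral-3b-instruct-GGUF:Q8_0
```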

  • @desireco
    @desireco 2 months ago +1

    If you import Hugging Face models into Ollama, they are usually beyond slow for some reason. I think the nature of the import just makes them use excessive resources, not the model size. So however interesting the model, it is just a hassle and not worth it. But let me give it a whirl just to make sure; maybe they fixed it.

    • @technovangelist
      @technovangelist  2 months ago +1

      Not usually. They perform just as well if you get them from HF as if you get them from Ollama.

    • @desireco
      @desireco 2 months ago

      @@technovangelist I am downloading one and will try it. I might have been unlucky with weird models in the past, who knows.
      Thanks for covering this; it's really useful, and I prefer Ollama just because I am used to it.

    • @desireco
      @desireco 2 months ago +1

      Just to confirm, everything works well. I don't know why converting models in the past made them slow; it's definitely no longer an issue. Thanks again for a great video.

  • @electroheadfx
    @electroheadfx 2 months ago

    Is it possible to run MLX models on Apple Silicon for faster GPU performance, like LM Studio can?

    • @technovangelist
      @technovangelist  2 months ago +1

      Not yet. LM Studio added it recently, which has allowed them to catch up to Ollama and go past by a couple percent at most. I tried it last night and, based on their claims, expected mind-blowing performance, but it's a tiny improvement over Ollama. Try it.

    • @electroheadfx
      @electroheadfx 2 months ago

      @@technovangelist Thanks for the exchange and your videos, great work.

  • @ОлегКонстантинович-г2ж
    @ОлегКонстантинович-г2ж 2 months ago

    What is the name of the GUI?

    • @technovangelist
      @technovangelist  2 months ago

      I mentioned it. Pageassist, a Chrome extension.

  • @Pure_Science_and_Technology
    @Pure_Science_and_Technology 2 months ago +1

    Does Ollama have a GUI? Lol, later in the video you answered my question. 😊

    • @technovangelist
      @technovangelist  2 months ago +3

      Ollama is text-based. There are many GUIs that run on top of it, but few are as good as the text interface.

  • @jinchoung
    @jinchoung 2 months ago

    thanks ollama....

  • @envoy9b9
    @envoy9b9 2 months ago

    Can I download an MLX model and run it on Ollama with Apple Silicon?

    • @technovangelist
      @technovangelist  2 months ago

      You would use mlx for an mlx model.

  • @PerfectlyNormalBeast
    @PerfectlyNormalBeast 2 months ago +1

    I think they're videos about ollama, but they might just be singing for my cat

  • @kumaraswamypallukuri3570
    @kumaraswamypallukuri3570 2 months ago

    Nice video. Can you download two models and run them together in Ollama?

    • @learndatawithmark
      @learndatawithmark 2 months ago

      Yes, you can download as many models as will fit on your machine. Ollama lets you load multiple of them in memory and run them in parallel too.
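That parallelism is controlled by server environment variables; a sketch using the settings documented in the Ollama FAQ (the values here are examples):

```shell
# Allow two models resident in memory, up to four concurrent requests per model
OLLAMA_MAX_LOADED_MODELS=2 OLLAMA_NUM_PARALLEL=4 ollama serve
```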

  • @ghazanfarabidi4137
    @ghazanfarabidi4137 2 months ago

    Gotta pivot to Otiger

  • @jsward17
    @jsward17 2 months ago

    Did you get kicked off the team?

    • @technovangelist
      @technovangelist  2 months ago

      It's been answered a few times elsewhere on the channel. But there are lots of reasons folks don't stay at companies forever. And Ollama is just another company like any other.

  • @fabriai
    @fabriai 2 months ago

    Are you a tiger whisperer?

  • @mal-avcisi9783
    @mal-avcisi9783 2 months ago

    I am so sick of the word "model". I hear model, model, model... my brain gets triggered by this word.