Love this, I think we’ve all been doing just-in-time learning to run and keep up to date with what’s happening every couple of weeks. Great to tear it back to the foundations Matt
Exactly what I was looking for! THANK YOU!
Thank you for this awesome course, I‘m enjoying it!
For those using a Mac: type nano modelfile in the terminal to create the modelfile.
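To make that concrete, here is a minimal sketch of creating and using a modelfile (the base model, parameter value, and model name are just illustrative, not from the video):

```shell
# Write a minimal Modelfile (contents here are illustrative).
cat > modelfile <<'EOF'
FROM llama3.1
PARAMETER temperature 0.7
SYSTEM "You are a concise assistant."
EOF

# Then build a custom model from it (requires ollama to be installed):
# ollama create concise-llama -f modelfile
```

The create step is commented out so the sketch runs without Ollama present; uncomment it on a machine with Ollama installed.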
I'm really enjoying this series. Thanks.
I love the way you explain. Thanks
Looking forward to your next video
and here it is.... comparing quantizations
Wonderful video, Matt. Thanks so much for sharing this.
Excellent content Matt! Congrats! Keep on going.
I love your videos! Your explanations are amazing, thank you!
this is amazing, super clear, thank you!
Very nice! Thank you.
Man, I love it. I already subscribe. Something I'd really love to know is how to store my local Ollama models on an external hard drive on a Mac. As you know, Macs don't have much space, so I bought a special hard drive that runs at 40G/sec to hold models and other stuff, and I'd love to keep the models there rather than on my internal drive. Thanks for the great content and explanations.
They don't have much space? Sure they do. Mine has 4TB inside.
But you can use the OLLAMA_MODELS environment variable to start storing them elsewhere.
@technovangelist Do you plan to do a video about it?
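For anyone who lands here first, a minimal sketch of relocating the model store with that environment variable (the external-drive path is an assumption; on a Mac it might be something like /Volumes/FastSSD/ollama-models):

```shell
# Sketch: point Ollama's model store at another directory.
# The path here is a placeholder, not a real recommendation.
OLLAMA_MODELS="${OLLAMA_MODELS_DIR:-$HOME/external-models}"
mkdir -p "$OLLAMA_MODELS"
export OLLAMA_MODELS

# Make it permanent by adding the export line to ~/.zshrc, then restart
# the Ollama app (or `ollama serve`) so it picks up the new location.
```

Note that models already downloaded stay where they were; you'd move them into the new directory yourself.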
Hey Matt! Off-topic comment, but I guess I'm feeding the ol' YouTube algorithm anyway!
I haven't watched your entire backlog so apologies if you've already covered this, but I'd love to see some content / videos on the following topics:
1. How can you use Ollama in a production environment? Topics around infrastructure, reasonable techniques (e.g. handing off processing to async jobs when possible), cost, etc. I'm not sure how common this use case is, but I am evaluating using something like Llama 3.1 to help summarize some potentially very large text files, and weighing the cost difference between using something turnkey like OpenAI's APIs vs. figuring out hosting myself (well, my company). There seems to be a lot less out there about production-hardening some of these open source models (or I just haven't been paying attention!)
2. A "state of the union" high-level overview of the options available to a software developer new to using AI. You have covered this in a lot more detail in various forms, but an overview of what tools are actually at a person's disposal for trying to use AI to solve some problem. When I first started looking at this stuff, I thought my only options were buying a bunch of supercomputers to train models and learning a lot about matrix multiplication. But we have RAG, we have "fine tuning", we have modifying system prompts... a high-level overview of what a layperson can do, and perhaps where the reasonable off-ramps to more advanced use cases are, would be super helpful (i.e. when do I need to brush up on my linear algebra? :))
Thanks for your work!
More cool stuff please!
Thank you!
Thanks
Thanks so much for that
I'm trying to use ollama serve to integrate my app with Ollama, but a lot of functionality doesn't work when using ollama serve. For example, I can list, pull, and rm models with serve, but when I try to load a model into memory via the API, or just run a model in the terminal, it crashes. Based on the log, it just says Ollama crashed and restarted. For now I'm using the ollama app.exe to start the full application to unblock development, but I can't find any documentation to help troubleshoot the issue. I would really appreciate a video just about the ollama serve command and how you'd use it.
Please share the link to the video about reducing a model's size for specific tasks. For example, a weather-only model wouldn't need the whole context for this.
You would be able to fine-tune for that, but it wouldn't reduce the size. Reducing the size would be a very expensive process.
Ah... ollama serve... LOL. I wasted a week until I realized it was a user issue on Linux. I felt so stupid having duplicate models and such. This is a really good video; anyone new to Ollama should watch it. If I had watched this earlier, I wouldn't have wasted a week just to realize my simple user mistake...
thanks, m :)
I have a noob question. If anybody can upload a model on Ollama, is it possible for a malicious user to upload malware disguised as a model? And are there measures to prevent such a scenario.
There would be no way to do that.
Hi Matt, thank you for more amazing content.
I'm working with Ollama and other tools available from the community to develop some solutions for my company.
I need some help from a professional consultant for this job.
Could you work with me, or maybe recommend someone who can help me do it?
I wouldn't recommend creating models with the legacy Q4_0 quant type; it's deprecated and worse quality than the K quants (or IQ quants if you're running with CUDA).
Removing models is the most annoying part because you have to type the name exactly. Wish they made it easier to just select and delete via a GUI, or list the models and remove one by number.
That’s one reason I love gollama. I have a video about it.
What location do you run that download-a-Hugging-Face-model command from? And where does it download to? The same location as the other models? Where is that?
Anyone want to swap code for tokens?