Run Mistral, Llama2 and Others Privately At Home with Ollama AI - EASY!
- Published 20 Jun 2024
- Self-hosting Ollama at home gives you privacy whilst using advanced AI tools. In this video I provide a quick tutorial on how to set this up via the CLI and Docker with a web GUI.
Ollama:
ollama.ai/
Video Instructions:
github.com/JamesTurland/JimsG...
Recommended Hardware: github.com/JamesTurland/JimsG...
Discord: / discord
Twitter: / jimsgarage_
Reddit: / jims-garage
GitHub: github.com/JamesTurland/JimsG...
00:00 - Overview of Ollama and LLMs
01:38 - Creating a VM
02:52 - Installation - CLI
05:50 - Installation - Docker
11:55 - Outro
Category: Science & Technology
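For reference, a rough sketch of the two install paths the video covers (CLI, then Docker with a web GUI). The real compose file is in the linked GitHub repo; the image tags, port numbers and the OLLAMA_BASE_URL variable below follow the public Ollama and Open WebUI documentation and are assumptions, so defer to the repo for the exact setup used in the video.

# --- CLI install (Linux) ---
# Official install script (review it before piping to sh).
curl -fsSL https://ollama.ai/install.sh | sh
# Pull a model and chat with it from the terminal.
ollama run mistral

# --- Docker install with a web GUI ---
# Ollama API container; add --gpus=all if the NVIDIA Container Toolkit is installed.
docker run -d --name ollama -p 11434:11434 -v ollama:/root/.ollama ollama/ollama
# Web front end pointed at the Ollama API; mapping it to port 3000 is an arbitrary choice.
docker run -d --name open-webui -p 3000:8080 \
  -e OLLAMA_BASE_URL=http://host.docker.internal:11434 \
  --add-host=host.docker.internal:host-gateway \
  -v open-webui:/app/backend/data \
  ghcr.io/open-webui/open-webui:main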
I'm starting to think you are in my head. I've been looking to do this for about a week. Then (AGAIN!) you pop up with an excellent walk-thru.
Incredible
It was the AI, not me!
@@Jims-Garage "AGAIN!" means I guess that this happened before already. 1st time okay, it might be the AI but the 2nd time.. nah, that's shady dude it's a whole different story :D
Great walkthrough! Got it up and running in my homelab
Just a fantastic walk through! Thank you for being thorough on everything.
My pleasure!
Thanks Jim! I've been meaning to run AI on my own infra for a while. So hopefully your video will motivate me to actually go and do it.
You can do it! It's a pretty handy tool once it's up and running.
Thanks for the demo and info, this is awesome, and I think soon it will answer Jim's Garage. Happy Holidays, I will definitely use it in my home lab.
Thanks, Chris. Good to hear. Have a great Christmas break.
O nice! Wasn’t expecting a vid on this but I’m glad you made one.
Keep people on their toes
Loved the first question you asked Dolphin... it's funny, I've never heard of any of those 5 that it listed.
Thanks for the tutorial!
Glad to help! I haven't heard of them either - AI is known to sometimes make things up...
Man, I'm going to need to upgrade my home lab again.... This is exciting stuff.
It's super powerful 🤯
Jim slow down.. I can’t keep up with your videos. I still haven’t finished the last one.😂
You can do it! 😂
Just got this up and running on my server! Thanks for the video, it was super easy. Put it on a VM with GPU passthrough, just a 2080 I had lying around. Had an issue with AVX, so I changed the CPU type to Host in Proxmox. Not as fast as GPT, but I like having my services local.
That's awesome, I agree. It is slower, but who knows how many GPUs are processing your queries on the GPT side.
Hi Jaymax, what do you mean by changing CPU to Host? I've been trying to get Ollama running on my Proxmox server for weeks. Tried a Debian container, an Ubuntu Server VM and a Win11 VM with Docker, always with the same result. I have fairly old hardware, an HP Z800 with DDR3 and an X5690, so I imagine the hardware is what's limiting my success in getting this to run.
@@martinzipfel7843 In Proxmox, go to the hardware options for the VM, click the CPU, and edit the type to host.
@@martinzipfel7843 Yup, what Jim said. In Proxmox: VM > Hardware > Processor > Type = Host. That said, I had the AVX error on my VM with the Proxmox Type=x86-64-v2; changed it to host and no issues.
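For anyone who prefers the command line over the Proxmox web UI, a minimal sketch of the same change on the Proxmox host, assuming a hypothetical VM ID of 100:

# Set the virtual CPU type to "host" so the guest sees the real CPU flags (e.g. AVX),
# which Ollama's llama.cpp backend needs. NUMA only matters on multi-socket boards.
qm set 100 --cpu host
# Stop and start the VM; a reboot inside the guest won't pick up the new CPU type.
qm stop 100 && qm start 100
# Inside the guest, confirm AVX is now exposed.
grep -o 'avx[^ ]*' /proc/cpuinfo | sort -u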
Was able to get it working in VS Code using the Continue extension! Had to tweak the docker-compose file to expose port 11434 and then update the config.json in the extension to point to my server!
Awesome 😎
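For anyone wanting to point an external tool such as the Continue extension at the server, a quick sketch of verifying the Ollama API is reachable once port 11434 is published. The hostname my-server is a placeholder, and the model name assumes mistral has already been pulled:

# A bare GET against the port should answer "Ollama is running".
curl http://my-server:11434
# Quick generation test against the REST API.
curl http://my-server:11434/api/generate -d '{"model": "mistral", "prompt": "Say hello in one sentence.", "stream": false}'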
Well, now that you have done a video on it, I will have to try it. Probably won't use it, but it's fun. Not sure I have the specs for it, but I can try.
Some of the smaller ones are okay, they'll just be slow.
Thinking of some fun use cases. Integration with Home Assistant to build out your own voice assistant would be awesome.
That would be cool!
Thanks for showing us how easy it is to get started with these LLMs.
The only downside is that now I need to budget for an Nvidia card in my Xmas shopping list 😛
Haha 😂 it's ok with CPU, but definitely not the fastest
Jim, fantastic and simple. Thank you for that. Is there any way to do the same but with AI that can create images? Something like DALL-E or Midjourney? I wonder if there is a way of hosting that on my own hardware and having it at home just for myself 😉
You'll have to wait and see 😉 (spoiler: yes!)
@@Jims-Garage Great 🙂 In that case, I'll be patient. Thank you!
Ahoy Jim, what do you think about making your next video on UrBackup running in a Docker container on a Synology NAS with a reverse proxy 😊
Sorry, what do you mean? Restoring a backup?
Perfect, this is what I was looking for. Do we need to do PCIe passthrough for the GPU in Proxmox? Do you have a video for this?
Yes, you'll need PCIe passthrough for the GPU in a VM, or you could use it in an LXC. Check my GPU video with the Baldur's Gate pic, or the LXC one.
@@Jims-Garage Thanks, joining your channel in 1,2.... hahaah
@@GeekendZone thanks for your support
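Not a substitute for the full passthrough video, but a rough sketch of the Proxmox-side steps, assuming an Intel CPU, GRUB as the bootloader, a hypothetical VM ID of 100 and a GPU at PCI address 01:00.0:

# 1) Enable IOMMU: edit /etc/default/grub so the kernel line reads
#    GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on iommu=pt"  (amd_iommu=on for AMD),
#    then apply it and reboot the host.
update-grub && reboot
# 2) Find the GPU's PCI address on the host.
lspci -nn | grep -i nvidia
# 3) Attach it to the VM (the VM usually needs the q35 machine type and OVMF/UEFI BIOS).
qm set 100 --hostpci0 0000:01:00.0,pcie=1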
Hey Jim, thanks for this video. A friend of mine and I were talking about this yesterday. AI is listening... LOL. But hey, are you doing video passthrough? I have the package installed in Proxmox like you mention at first, but it looks like it is looking for a GPU anyway. When I try to run the model I get "Error: llama runner process has terminated".
Unfortunately I don't have an Nvidia card to pass through. As soon as Intel support is available I'll be passing my Arc in to see what that does.
I have another workstation, so I'm going to install Ubuntu 22.04 on it and go that route. I think that will work better for me.
I was able to get Ubuntu 22.04 installed and it loads the models now, but when I run a query it says "Error: llama runner exited, you may not have enough available memory to run this model". I have 80 GB of RAM installed.
Mistral runs fine. Seems like some of the others may require more VRAM. I only have 8 GB on my video card.
@@kf4bzt I get exactly the same error ("Error: llama runner process has terminated"). What did you do to get rid of it?
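A few hedged checks that usually narrow this error down; the service and container names assume a default install, and the memory figures are the rough minimums from the Ollama docs:

# If a GPU is meant to be used, confirm it is visible and has free VRAM.
nvidia-smi
# Check system RAM: roughly 8 GB free for 7B models, 16 GB for 13B models.
free -h
# Ollama's logs usually name the real cause (missing AVX, out of memory, etc.).
journalctl -u ollama --no-pager | tail -n 50   # CLI install
docker logs ollama --tail 50                   # Docker install
# If memory is the problem, start with a smaller model.
ollama run mistral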
Works great in WSL2 with GPU acceleration; unfortunately I have no GPU in my Proxmox box to test with.
Good to know. That's probably the best way for most people to test it.
Since about a week ago, Ollama has had ROCm support, so it will work on AMD GPUs with the RDNA2 or RDNA3 architecture; that means all 6000 and 7000 series cards.
That's a great update 🙂
@@Jims-Garage I'm going to buy new AMD GPU soon, I will test this out 😉
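For reference, the ROCm path uses a dedicated image; a sketch based on the Ollama Docker documentation (the device paths are the standard AMD kernel interfaces):

# AMD GPU support: pass through the compute/render devices and use the :rocm image.
docker run -d --name ollama \
  --device /dev/kfd --device /dev/dri \
  -p 11434:11434 -v ollama:/root/.ollama \
  ollama/ollama:rocm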
Will _ANY_ nvidia GPU do? I have an old spare basic one.
I'm not sure. How old is your GPU?
Would it be necessary to get a GPU for each project?
- Llama
- Virtual workstation
- Jellyfin
- PhotoPrism
- Automatic Ripping Machine aka A.R.M.
- Etc. ...
It would be interesting to know how to use just one for all of that...
If you're running all on the same Docker host then all containers can share the GPU.
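A sketch of what that looks like with the NVIDIA Container Toolkit installed; the second container, its image and port are purely illustrative:

# Any container can request the GPU; several services on one Docker host can share it.
docker run -d --name ollama --gpus=all -p 11434:11434 -v ollama:/root/.ollama ollama/ollama
# A second container simply asks for the same GPU.
docker run -d --name jellyfin --gpus=all -e NVIDIA_DRIVER_CAPABILITIES=all -p 8096:8096 jellyfin/jellyfin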
I'm still looking for an easy way to install an LLM with a GUI that gets its information from my documents and can be corrected when it gives wrong answers. (My documents contain proprietary instructions/troubleshooting techniques.)
That sounds like the dream. Not aware of anything that sophisticated at present.
IMPORTANT!!
Set the processor of the VM to type host and enable NUMA, or you get a "signal: illegal instruction (core dumped)" error.
NUMA is for dual-socket systems only.
Man, I'm surprised what an 8c/16t (Ryzen 7 5700X) can do just by brute-forcing. It's pretty fast too.
Agreed. The CPU is fine for a homelab in most cases.
@@Jims-Garage Just started looking deeper into passing the GPU through for the LLM and for transcoding video. It's not that easy to do, but still fun! Thank you!
What actually useful things has anyone done with these models? They seem *potentially* useful, but I haven't found anything actually useful yet.
I lean that way at the moment. I can never fully trust the output. I'm sure it'll improve over time.
Followed the guide, downloaded the same model, but I get "Uh-oh! There was an issue connecting to Ollama." Both containers are running.
Server runs 60 GB RAM, 32 cores @ 2.999 GHz.
Did the git clone.
Try re-upping the container.
Guess I found out.
On Proxmox I had KVM cores. Used host after several reboots. Now it seems to work.
@@hawolex2341 great 👍
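For anyone else hitting the same "issue connecting to Ollama" message, a couple of quick checks; the container names match the sketch above and are assumptions:

# From the Docker host, Ollama should answer "Ollama is running".
curl http://localhost:11434
# Confirm both containers are up and look for connection errors in their logs.
docker ps
docker logs open-webui --tail 50
docker logs ollama --tail 50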
I was expecting to learn how to generate uncensored manga 😂
Stay tuned... ;)
first :D
Actually been sitting on pins and needles waiting for this one :D Good one mate, getting it out so fast - super excited, going to start looking right away.
You're welcome. Just in time for your new hardware...