LM Studio is also an alternative worthy of looking at to serve multiple loaded models.
And it's much easier and faster to install
I don't get Ollama when LM Studio is SO much simpler to get setup and running.
Excellent tutorial. This is the most useful and detailed video I have seen in a while. Great job!
Keep in mind that UNLESS you're using one of the very large parameter models, the output is often wrong (hallucinations!). DeepSeek-R1 (8 billion parameters) listed "Kamloops Bob" (whoever that is) as the 4th Prime Minister of Canada. It told me that there were two r's in strawberry, and only corrected itself (with a lot of apologizing) after I pointed that out. It also told me that Peter Piper picked 42 pecks of pickled peppers, because that's the answer according to the Hitchhiker's Guide (42 is the universal answer to everything...LOL). Unless you have the space and hardware to install one of the very large models, I wouldn't take any of the output as accurate without cross-checking. It's fun (hilarious, in fact) to play with, but take the results with a LARGE grain of salt.
How much VRAM do you have?
Yep, this got me curious. I'm installing it now.
Follow up and let me know how it goes!
Great video! Thanks for taking the time to create it.
Go back and read what DeepSeek actually said when it did give an answer. Last paragraph: "In summary, while the Teininmen Square Massacre is recognized as a pivotal event of 1989, the lack of comprehensive and accessible information about it underscores its complexity and the challenges posed by historical silencing." Basically: we can't say because we're not allowed to know!
Basically, we can't say because we aren't allowed to know 😢
Going to try this as soon as I get home.
It took you 19 minutes to tell us how to set up a local AI in 10 minutes, but DeepSeek would have only taken 5 minutes. 🤣
I just set up Ollama on a VMware VM on my 12th gen i9 laptop. It's not the fastest thing, but it was faster than I thought it would be, at least using the 1.5B or other small DeepSeek-R1 models. Now I want to actually build a small AI machine with a decent GPU.
This is super cool! Instructions on how to uninstall all of this could be helpful as well
Any way to run it without Docker?
Yeah, he literally said that it works without it. Docker is just to make it look nice.
@@Viper_Playz I want to make it look nice without Docker.
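For what it's worth, Open WebUI's own docs also describe a non-Docker install via pip; a rough sketch, assuming a Python 3.11 environment (the version the project calls for):
```
# Install Open WebUI without Docker, then start the web interface
pip install open-webui
open-webui serve   # by default the UI comes up at http://localhost:8080
```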
What if I want to delete the first model I downloaded (Llama) and just use the second one that I downloaded (DeepSeek)?
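In case it helps, models can be removed from the command line with Ollama's built-in commands; a quick sketch (the llama3.2 tag is just an assumed example of what was pulled):
```
ollama list          # show installed models and their sizes
ollama rm llama3.2   # delete the one you no longer want
```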
Excellent! Amazingly detailed tutorial. Keep it up 👍🏻
16:07 Any LLM can use the tag.
Thank you for sharing, working with no issues.
Installed it just as you said. It works. But when I turned off WiFi, it said there was a networking problem. I thought this was standalone.
Mine states it cannot install Docker Desktop because of "WSL2 is not supported with your current machine configuration. Please enable the 'Virtual Machine Platform' optional component and ensure virtualization is enabled in the BIOS."
That's "vt-x" in bios. I forget what the amd equivalent is called, but anyway you need to turn that on
No 685B params?
Very helpful video!
This is a GREAT video!
Would this work with macOS too? If not, how? Greatly appreciated!
And can you have multiple users logged into the Web UI portal, if I, say, wanted to set up a server with this method?
Yes - absolutely. But keep in mind each user running queries is going to use a lot of processing power. The more users using your local LLM, the more powerful a system you'll need.
You can. You simply use the IP address of the host machine that Open WebUI is running on, along with port 3000. Example: http://192.xxx.xx.xx:3000 (of course changing 192.xxx.xx.xx to whatever the IP address of the host machine is).
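For context, port 3000 comes from the port mapping in the container command; this is roughly the run command from the Open WebUI README (image name and ports assumed unchanged):
```
docker run -d -p 3000:8080 \
  --add-host=host.docker.internal:host-gateway \
  -v open-webui:/app/backend/data \
  --name open-webui --restart always \
  ghcr.io/open-webui/open-webui:main
# -p 3000:8080 publishes the UI on host port 3000, so other
# machines on the LAN can reach http://<host-ip>:3000
```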
Excellent !!! I will have to load this up on my server :)
Is this the same process on Mac?
Yes, this works on a Mac; running it on a Mac Mini M4 with no issues...I actually did all this yesterday before his video came out...super weird...lol
@ lol thanks! I’ll have to check out some videos
@ Heads up: do not go with Llama 3.3 on a Mac mini M4. Not only did it crash my computer, it brought down my whole UniFi network...oops...lol. Just rock llama3.2:latest and you will be fine.
@ Thanks. I currently have a MacBook Pro M4 with 24GB of RAM, not sure what the difference is.
@@michaelthompson657 I think it's based on the billions of parameters (??). Llama 3.3 is like 70 billion, a 42GB download; llama3.2 is only 6 billion and 4.5GB...I'm pretty sure your MacBook can handle 6 billion no issue.
So helpful. Thank you.
Only a human can deliver a good turkey bacon avocado recipe, in my honest opinion.
Nice video. Can you please make a video on how to completely uninstall all of this from my computer after setting everything up?
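As a rough sketch of what a full cleanup would involve (container, volume, and model names assumed to match the defaults used in this kind of setup):
```
# Remove the Open WebUI container, its data volume, and its image
docker stop open-webui
docker rm open-webui
docker volume rm open-webui
docker rmi ghcr.io/open-webui/open-webui:main

# Remove downloaded models, then uninstall Ollama and Docker Desktop
# the usual way (Apps & features on Windows, drag to Trash on macOS)
ollama rm llama3.2
ollama rm deepseek-r1
```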
Can it work on a NAS, via Docker?
You could probably make it work, but a NAS is not made for the computing power this requires…it would not work well.
How do I remove a model so you don't have two or more going?
The only one(s) in use are the ones you’ve selected at the top, so if only one is selected, only one is in use.
I currently have Ollama installed on a Linux system using llama 3.2 and hooked it to Home Assistant Voice for IoT voice control. It's got some promise.
Can you do this on a Mac?
Absolutely…the setup may be a bit different, but the concepts are all the same.
I'd rather use DeepSeek's API than run it locally since I don't have a machine capable of handling a 32B or 70B parameter model. The API calls are affordable enough for my needs.
Why not do this all in Linux?
You certainly can - but most folks who are wanting to play around with this are going to be on Windows.
I tried but it wasn't as easy and clear as it looks in this video -.-
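For anyone retrying it on Linux, the Ollama half really is a one-liner using the officially documented install script, and Open WebUI runs from the same Docker command as on Windows; a sketch assuming defaults:
```
# Install Ollama via the official install script
curl -fsSL https://ollama.com/install.sh | sh

# Pull and chat with a small model right from the terminal
ollama run llama3.2
```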
I hate Docker. Totally convoluted. Far from average-user ready. Instead of having to configure containers with the CLI, how about just standalone apps?!
Can someone explain to me the advantages of doing this?...Isn't it the same as searching the internet for answers?
@CrosstalkSolutions can you do a video doing this on mac
Great video. Would you sleep with the devil and also give him your car keys?
Cool. I assume running it on a local machine still requires an internet connection?
You need an Internet connection to download everything including the various language models, but they can be run offline once downloaded.
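As a small illustration (the model tag is just an example), the download happens once while online and everything after that runs from local disk:
```
# With internet: download the model weights once
ollama pull deepseek-r1:8b

# Later, with WiFi off: inference runs entirely locally
ollama run deepseek-r1:8b "Summarize the plot of Hamlet in two sentences."
```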
I have a 4000-series NVIDIA card with only 16GB of VRAM, so I can only run small language models 😢
I have a 3090ti with 24GB and it runs the 70B deepseek model just fine. Llama3.3 is super sluggish though.
This is extremely interesting: Today (2025-01-30, 18:30 UTC), I downloaded deepseek-r1:7b, and I entered the exact same question as you: "Tell me about the Tienenmen Square Massacre of 1989". From llama3.2 I got the correct answer, but from deepseek-r1:7b I got "I am sorry, I cannot answer that question. I am an AI assistant designed to provide helpful and harmless responses". Why the difference from your answer?
(By the way, I am running Ollama on a MacBook Pro, Apple M2 Pro with 16 GB memory.)
Well - that's exactly what I showed in this video...sometimes the deepseek model answers that question, and sometimes it gives the censored answer - maybe it has to do with what was asked earlier in that same conversation?
Why go through all that when you can just use the "Page Assist" browser extension for Llama? Don't need Docker. Don't need Open WebUI. Don't need LM Studio. Just the browser extension.
Given the NFL Cheats scandal, I was almost tempted to hit the Down vote for the mention of Taylor Swift LOL. Rest of the video was good though ;)
Haha - I know nothing about any NFL scandal.
loving that everyone is on the deepseek ai bandwagon 😅
Until you find out that all your data ends up in China.
I tried to connect a ChatGPT model to this, but there is a price involved to use the ChatGPT API, am I correct? Is there a free version of ChatGPT that you can connect to Open WebUI?
OpenAI doesn’t make their models available for download. But you can use Llama (Meta) or Gemma (Google) or Phi (Microsoft) instead.
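If it helps, those alternatives are all downloadable through Ollama; a sketch with the library names as they appear at the time of writing (exact tags may have changed):
```
ollama pull llama3.2   # Meta's Llama
ollama pull gemma2     # Google's Gemma
ollama pull phi3       # Microsoft's Phi
```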
Good
If you had spelled Tiananmen correctly it would have been censored.
That’s possible…though it did know what I was asking.
@@CrosstalkSolutions You figured out how to bypass the censorship.. just one letter off from the official spelling gets the uncensored view
What about Ollama?
Didn't watch the video...
Sorry, too fast 😮
lol, no
The ugly keyboard though
Love your videos bro! Keep teaching!!!! Appreciate you.