Who's installing Llama 3.2?
Much appreciated! Installing the 11B on my PC now, but can you make a video on how to get the 1B (or 3B, not sure if my phone is beefy enough for 3B) model running on Android?
Download button not showing up
@@singingshelf834 I just learned that the EU and some other countries are left out. Will try with a VPN later...
I installed the 3B version of 3.2 locally with OpenWebUI
I got excited and downloaded the 405B model 😅
Excellent tutorial video! I've been thinking about trying out local AI for quite some time, but I never got around to it. This made it really simple and hassle-free to get it up and running. Thank you! You've earned yourself a new subscriber.
I came for the "Vision" part of the title only to be told it's not available yet on Groq. Playing with the Python code on the model card, it'll read text from images, but just about any question about the image gets a safety warning about not being able to ID people :) Even asking about the rabbit in their example, "What is this animal and what is it thinking?", gets: "I'm not able to provide information about the individual in this photo. I can give you an idea of the image's style, but not who's in it. I can provide some background information, but not names. The image is not intended to reveal sensitive information. The image is not intended to reveal personal information. The image is not intended to reveal personal information. The image is not intended to reveal personal information. The image is not intended to reveal personal information. The image is not intended to"
The real cost of censored models is dumbing down the model like that
@@pmarreck switch to the 'instruct' version and it works much better.
Installing Docker took three times as long as installing Ollama. Installing this on Windows is different from what you show. On Windows 10, you don't have to install Llama 3.1 and then 3.2; just install 3.2. Also, after Docker installs, it gives a button that says "Close Restart". I thought it meant close the app and restart it... Noooooooooo, it meant restart Windows, so just be prepared. It's working great for me. Thank you.
Thanks for the info!
You are funny
Installed, and have been actively using it. I find it really fast and competent in its answers. This is running on a MacBook Air M3 with 16 GB. I am now training the model on specific papers so I can use it as a medical repository, which appears to be going well.
How can you train the model on your own documents? :) Interested in doing the same!
How do you train your model?
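A note on what "training" usually means here, as an assumption rather than anything from the video: uploading papers in OpenWebUI typically does retrieval-augmented generation (RAG), not fine-tuning; the documents are chunked and the relevant text is prepended to the prompt. A bare-bones Python sketch of that idea against Ollama's default local endpoint (the file name is hypothetical):

    import requests

    # Bare-bones RAG idea: prepend document text to the question.
    # "paper_excerpt.txt" is a hypothetical local file; real setups chunk
    # and embed documents and retrieve only the relevant passages.
    context = open("paper_excerpt.txt", encoding="utf-8").read()
    question = "Summarize the key findings of this paper."

    resp = requests.post(
        "http://localhost:11434/api/generate",  # Ollama's default port
        json={
            "model": "llama3.2",
            "prompt": f"Answer using only this context:\n{context}\n\nQuestion: {question}",
            "stream": False,
        },
    )
    print(resp.json()["response"])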
Came here for Vision, since it's in the title. Left with no vision.
Thank you so much for this video. It was definitely a value-adding video!!!
Thanks so much, been trying to get this to work for like 25 minutes and finally landed on your video
I’m giving it a go! Thanks for the video.
Welcome
That was easy! I already had Docker. Everything turned out perfectly with the 3B / 1B text models. 📐
Hi, I am a new developer from India, where GPU hardware is a bigger bottleneck for developers! Therefore, please give the minimum GPU or CPU requirements at the start of your next YouTube video. Thanks for sharing such a nice video in a straightforward manner
Your explanation is just awesome, my friend.
Wow, amazing bro, you saved me time and money. If my damn internet wasn't so slow, I probably could have done the install in real time. Was able to get it up and running on Windows 11 in no time. Did have to reboot after the Docker install, but all works well; Windows Firewall also had to be disabled locally. Anyhow, it worked great, thank you! Can you do a video on how to train this model????
Excellent videos... Thank you very much, sir.
Title is: "Meta's New Llama 3.2 with Vision is here - Run it Privately on your Computer". Are you sure?
Can you explain in the next video why you chose Ollama versus Hugging Face?
Llama became my best friend after GPT went all corpo cnt on me.
What do you mean? Never used Llama, so what's the difference?
@@RememberTheLord freeeee opeeen souuuurcee kaching kaching for your broke a s
Thanks so much, I'll be testing this out today on an RPi 5 :D
Great video and so simple. Some guide had me running Ubuntu and all sorts. I gave up in the end, and I'm pretty IT savvy. This was a doddle!
Is there a multiline output box available in Gooey? I know we can generate an input multiline textarea, but I'd like to find an alternative to just printing to the default output box.
Thanks for the tip. I'll dig into it a bit more, but I don't know a way to get multiple text areas as an output
@@SkillLeapAI Ok, thanks SL.
Meta sponsoring this video 😂
How do I install and run the vision models? I have access already
Just got this all up and running. I have an older laptop with a low-end Ryzen 5 3500U (4 cores) with a Radeon Vega GPU and 12 GB total memory (shared with the GPU).
The Llama 1B and 3B will work. In terminal mode and through the WebUI, the 1B works well. The 3B and DeepSeek 1.5B are kind of slow.
*** One major issue I have found: when I use the WebUI, I can ask a question and it gives a response, but then the Ollama server stays up and running at 75% CPU utilization (0% GPU utilization) for upward of a minute or more after everything is finished. Trying to ask another question during that time is slower than molasses. I basically have to sit there for a minute or more and wait for the server run from the previous question to wind down (1 minute+ after completing the initial response) before it starts on the second response. This is true for Llama 1B and 3B as well as DeepSeek R1 1.5B.
I don't have that issue when I run in a Windows terminal... after the AI response, the CPU utilization drops to 5% immediately and another question can be asked.
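One possible explanation, offered as a guess rather than a confirmed diagnosis: Ollama keeps a model loaded for roughly five minutes after a request by default, and the keep_alive request parameter controls that window. A minimal Python sketch against the default local endpoint:

    import requests

    # Ask a question, then have Ollama unload the model immediately
    # afterward by setting keep_alive to 0 (the default is about 5 minutes).
    resp = requests.post(
        "http://localhost:11434/api/generate",
        json={
            "model": "llama3.2:1b",
            "prompt": "Why is the sky blue?",
            "stream": False,
            "keep_alive": 0,  # unload right after responding
        },
    )
    print(resp.json()["response"])

OpenWebUI also makes follow-up model calls after a response (for example, to auto-generate a chat title), which could explain the extra CPU load after the answer finishes; that behavior can be disabled in its settings.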
Excellent content and commentary!
Thanks for the insights
Very cool! What are your computer specs? In other words, what do I need to get that speed locally? What are the minimum specs to run Llama 3.2?
The 11B Vision model takes about 10 GB of GPU RAM
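A rough rule of thumb, not from the video: weight memory is roughly parameter count times bytes per parameter, and real usage adds KV cache and runtime overhead on top. A quick sketch:

    # Back-of-envelope weight memory at common quantization levels.
    # Treat these as floors; KV cache and overhead come on top.
    for name, params in [("1B", 1e9), ("3B", 3e9), ("11B", 11e9)]:
        for label, bytes_per_param in [("4-bit", 0.5), ("8-bit", 1.0), ("fp16", 2.0)]:
            print(f"{name} @ {label}: ~{params * bytes_per_param / 1e9:.1f} GB")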
I just hope the 90B is amazing & can output over 2k words & code
Thanks for the excellent video. I was able to walk through all the steps, but the Llama models do not show up in the WebUI. How do I add the installed Llama models to the WebUI chat interface?
Can you run it on AWS or Azure?
Thanks for the vid.
I want to make my app server connect to Llama. I tried making requests to the localhost where Llama is hosted on Docker, but I am getting "Method Not Allowed". Do you know how to do it?
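"Method Not Allowed" (HTTP 405) usually means the HTTP verb is wrong, for example a GET against an endpoint that only accepts POST. Assuming the server behind that port is Ollama on its default 11434, a minimal Python sketch:

    import requests

    # Ollama's /api/chat endpoint accepts POST with a JSON body;
    # a plain GET against it is what produces 405 Method Not Allowed.
    resp = requests.post(
        "http://localhost:11434/api/chat",
        json={
            "model": "llama3.2",
            "messages": [{"role": "user", "content": "Hello from my app server"}],
            "stream": False,
        },
    )
    print(resp.json()["message"]["content"])

One more gotcha: if the app server itself runs inside a Docker container, "localhost" refers to that container, not the host; host.docker.internal is the usual way to reach the host from inside Docker Desktop.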
Was LM Studio not an option in September? LM Studio is way easier, with ZERO terminal necessary. You can also run the LLM Farm iOS app on your Mac.
It doesn't have a lot of the functionality of what I showed. You can do a lot with OpenWebUI
I installed Llama 3.2 and it is running perfectly. Now I've decided to do an upgrade, and I downloaded Llama 3.3. How do I make sure that Docker is going to pick up Llama 3.3? Do I need to run another command in Command Prompt to get the new Llama 3.3 working? Can you give me some advice?
How do you run the 90B privately on Groq cloud? Also, what's the point of the demo when the multimodal model is still not available?
Not sure about that specific model, but I hope you do realise that you're not running privately when using services like Groq. You can never be 100% sure that your data and interactions with the model are private and not used internally by the service provider or sold. The way I look at it, any business is out to make money, and data is worth quite a bit these days. So if something is free or cheap, you should probably wonder whether you're the product they are making money on; ultimately it comes down to trust.
To run a model privately, you simply have to run it locally, but for a model with 90B parameters you would need a very expensive setup. Be prepared to either scale down your expectations to smaller models that fit in your VRAM, or scale up your budget for a system that can handle large models like that! 🙂
Can I install it without a graphics card? Thanks
Hi, I am unable to generate an image through the WebUI. I currently have 3.2, 3.1, and 3.0 installed
Can Llama 3.2 be integrated into a website as a chatbot?
Same, I want to know this
So you can run a model with 64 GB of RAM on a recent Windows computer?
Tried this, but Docker isn't showing OpenWebUI after it's finished loading... 😑
Why not tell us how much VRAM is needed for these models??
Everything seemed fine until I clicked the link in Docker. The website page opened with an error message stating, "This page isn't working." Can anyone offer assistance?
I have the same issue. Does anyone have a solution?
There is an option in Docker under Settings > Resources > Network to enable host networking. Enabling that setting worked for me.
Can you make a video about running a deep learning model locally on a Mac?
Hey man, amazing video. I've been using Llama 3.2 3B on my laptop ever since you posted this, thank you so much! I had a question though; I am not tech-savvy at all. A pop-up to update OpenWebUI appeared and I downloaded the zip, but I have no idea how to update it... Any help would be appreciated. If not, it's OK, I'll just keep running this old version. Thank you
On Windows, the terminal is called a DOS prompt
Windows 11 has something called Windows Terminal
Great video, how do you install the larger models?
I would argue that most people purchase cars via a subscription. In the UK we call it PCP, but basically it's just renting the car, with such a high cost at the end of the term that no one pays it.
Thank you!
Which version should I download if I just have a standard Dell laptop running Windows and no intent to use the vision features? I don't want to overwhelm my laptop, but I'm looking for good performance
Thanks. This worked. However, the actual model is very disappointing. A quick 10-minute use of it convinced me that it is pretty worthless. The number of hallucinations was off the scale. Also, the rather daft need for this ridiculous sequence just to run it is bizarre. You would think it would just download and run. Not a patch on ChatGPT or Claude. Not even close.
It's because we only have access to the 1B or 3B models. I just tried the 70B on Groq and it's MUCH better. But still not as good as those /shrug
Should I install Llama 3.1 before 3.2? Or can I download the new model from the start?
Will it work with CPU?
Probably not, or it will be extremely slow without a good GPU
I have 3.1 from this process. How do you update the model from 3.1 to 3.2?
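Assuming the Ollama setup from the video: each version is a separate model, so you pull the new tag and the old one stays on disk until you remove it (ollama pull llama3.2 in a terminal, and ollama rm llama3.1 to reclaim space). The same is possible over Ollama's local API; a sketch, with field names taken from its current docs rather than the video:

    import requests

    BASE = "http://localhost:11434"

    # Pull the new tag (same effect as `ollama pull llama3.2` in a terminal).
    requests.post(f"{BASE}/api/pull", json={"model": "llama3.2", "stream": False})

    # List installed models; 3.1 and 3.2 will both appear until one is removed.
    for m in requests.get(f"{BASE}/api/tags").json()["models"]:
        print(m["name"])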
Question: if I host the local AI but give my family access (with their own user accounts), will this give them access to my own uploaded content?
Is it really private, or does it send info in the background? (Or when back online?)
Where can I get Llama 3.2 with vision capabilities?
Is this one uncensored tho?
I tried this on my Windows machine... it's very slow!!!
Failed miserably on the classic question "How many words are in your answer to this question?"
At 2:30 you suddenly get a pop-up window and select "Move to Applications". How did you get that?
Yeah, that confused me too. Run the Ollama program and it'll open, but you don't need it. Just punch in the command he gives right after that.
Thank you....
nice tutorial
How can we run this on a mobile phone?
Is there an API key for Ollama models?
For Python
Not that I know of
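For what it's worth, and not from the video: Ollama's local server doesn't use real API keys, and it exposes an OpenAI-compatible endpoint, so the standard openai Python client works with any placeholder string as the key. A minimal sketch:

    from openai import OpenAI

    # Ollama serves an OpenAI-compatible API under /v1 on its default port.
    # No real key exists locally; the client just needs a non-empty string.
    client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")

    resp = client.chat.completions.create(
        model="llama3.2",
        messages=[{"role": "user", "content": "Say hello in one sentence."}],
    )
    print(resp.choices[0].message.content)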
IDK why my WebUI runs so slow
Please suggest a text-to-video converter model
Runway
How can I expose the API?
Hey Skill, very good video! I was wondering if I could help you with higher-quality editing on your videos and make highly engaging thumbnails, which would help your videos get more views and engagement. Please let me know what you think.
Last time I installed Llama 3, I burned my hard drive up.
These new smaller models should perform a lot better
@@SkillLeapAI I am thinking of uploading legal/court case citations, legislation, regulations, and the like. Can I ask what the minimum spec requirements for a PC would be to make it capable enough? Thanks
What you show there is different from reality, especially when you use a terminal to get a container; I get stuck there
We want a Llama 4 o1 model!!!
Thank you for the tutorial, but this thing is dumb as a rock compared to ChatGPT 4.0, so I probably won't find much use for it.
Even me, I was challenged configuring Docker and even the WebUI. Anyone who did it on Windows 11, can you help me finish?
"The world is now a reality" lol.
Interesting, but I stopped watching around 3 minutes in because of the tiny terminal screen you used to show what you were doing.
Why do we need Llama????? I will wait until they make it easy to install without any Docker, links... etc.
Privacy. If you don't care about that, you can just use it on Groq or Meta AI
I used LM Studio with Llama 3; it is easier
Like how it still thinks it's connected to Meta servers
Ask it to write code for gta 6
Lol
So a total BS clickbait title! Next time I see a Skill Leap AI video, I am ignoring it
AI generated video title.
clickbait ;(
How can I install the 11B model locally?
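For anyone finding this later: the vision models weren't in the Ollama library when the video came out, but an 11B variant has since appeared there as llama3.2-vision (ollama run llama3.2-vision in a terminal; it reportedly wants about 8 GB of VRAM). Its API takes base64-encoded images; a minimal Python sketch, assuming the model is pulled and a local photo.jpg exists:

    import base64
    import requests

    # Encode a local image and send it to the multimodal endpoint.
    with open("photo.jpg", "rb") as f:
        image_b64 = base64.b64encode(f.read()).decode()

    resp = requests.post(
        "http://localhost:11434/api/generate",
        json={
            "model": "llama3.2-vision",
            "prompt": "What is in this image?",
            "images": [image_b64],  # list of base64-encoded images
            "stream": False,
        },
    )
    print(resp.json()["response"])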