I love the video! I'm not sure why it's returning the system prompt in some of your queries though. I pulled the same layers and it seems to be working fine. Also, the llama2:13b model gives much better results for some of the questions you were asking, and will give you a result for the regex (along with the obligatory "don't do bad stuff" message).
Looks like you can now pull the 70B model too with "ollama pull llama2:70b". A paltry 39GB download.
Regarding the killers question, the AI was right. There are 3 killers: 2 alive and one dead.
Agree. The query didn't specify alive or dead killers.
Came here to the comments to see if someone had already said this.
Would be very cool if Ollama could implement Agents with this kind of simplicity. Maybe just connect using AppleScript or Apple Workflows or even just shell scripts. That would rock.
That was an interesting rabbit hole to venture down. I ended up messing around with Continue to improve my Arduino code in VS in the end (with the GPT model). It does the basics pretty well. However, the Llama uncensored models are really interesting, along with the models on Hugging Face. Thanks for the video.
Ollama on my M1 just refused to run. How do I get around this??
Thank you for this! The two main things I dislike about LLMs are the middle-school-level answers and the nanny rails. Hopefully, running an uncensored LLM will at least make the low intelligence level less grating.
Amazing video, Ollama is just what I was looking for. Glad I found this! Excellent content.
Thanks for the video. Works like a charm on my MacBook Pro M1. You rock!
May I ask, are you using an M1 or an M1 Pro, and how much memory do you have?
What speed does it run with (i.e. how long does it take to respond?)
@@boyinclass I've got the first generation MacBook Pro M1 with 16GB memory. It's very quick, but the language model is no GPT (i.e. the answers aren't as sophisticated).
Do you know how to get it to run in LangChain while taking advantage of the M1/2 chips?
Might be a good idea to revisit this now that there are specific models out there like codellama, tuned for writing code.
I have a video on using codellama here: th-cam.com/video/TWc7w5rvKrU/w-d-xo.html
Can I make Meta's original model (the original large-capacity model) available in Ollama? The original model file has been downloaded, but no matter how hard I look, I can't find the path where the model is stored.
Shouldn't we be more specific with the prompt? E.g. "How many alive killers are now in the room?"
Great point. I think a human would take that as a given in the question.
Does it automatically detect and use the Apple M2 GPU? Is there anything I need to configure to use it with the GPU?
Nope, it should automatically make use of Apple silicon.
Hi Ian, great clip - how do we get it to read the answers to prompts aloud with reasonably low latency?
You'd need to pipe the chat output into a text-to-speech (TTS) model. macOS has the built-in "say" command, so you could send the output straight into that if you want to keep it all local, but it won't be anywhere near as good as an external service.
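For anyone who wants to try that, here's a minimal sketch in Python (the model name and prompt are just placeholders, and it assumes the ollama CLI and macOS "say" are available on your PATH):

```python
import subprocess

# Ask the local model a question via the ollama CLI (llama2 is just an example model).
result = subprocess.run(
    ["ollama", "run", "llama2", "Tell me a one-sentence fun fact."],
    capture_output=True,
    text=True,
    check=True,
)
answer = result.stdout.strip()
print(answer)

# Read the answer aloud using the built-in macOS text-to-speech command.
subprocess.run(["say", answer], check=True)
```

Most of the latency is the model generating the whole answer before "say" starts speaking, so streaming the output sentence by sentence would feel snappier.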
Great video! I found it helpful :)
Very helpful thanks!
Great Ollama intro! I'm curious why Ollama has to run on macOS directly and not in a Docker container as a Linux OS. Is it a technical limitation? I presume eventually the model has to be deployed somewhere in the cloud in a container for production.
Not a technical limitation. At the time this video was made there wasn't an official Linux or container distribution. The good thing is that both of those have recently happened, so you can run it on Linux or under Docker if you wish. The only reason I'm running on a Mac is simplicity.
Can we run open source LLMs like Llama, or Stable Diffusion, on a Mac with an AMD Radeon R9 M380 2GB GPU and a 3.2 GHz quad-core Intel Core i5 processor?
DiffusionBee is a great app for running Stable Diffusion simply on a Mac. Take a look at the video I made on it.
@@IanWootten Sure. Can you also please share your views on language models? Will this hardware be enough to run an LLM?
Do I have to re-download the model each time I reopen the terminal?! I'm using llama2-uncensored. I got it working, then closed out of the terminal. When I reopened the terminal and asked to run llama2-uncensored, it began downloading the 3.8 GB file again. Is this normal? I don't want to stack up gigs of data because I'm running the program wrong.
Are you sure you ran *exactly* the same run command? Models that are already downloaded should just run locally when you call 'ollama run'. You can check which models are already on disk with 'ollama list'.
To my knowledge yes, I referenced your video to be sure. Additionally, after the second download it would not load up; it kept giving me warnings regarding inappropriate content etc. Very strange, since that's the whole purpose of the uncensored model. I'm an author, but I write erotica on the side, and the prose for erotica is pretty forgiving, perfect for assistance with AI, but the options for that are pretty limited, so I was hoping this would help. Even before I had trouble with the downloads, when it seemed to be working fine it appeared uncomfortable with erotica. Do you have any recommendations for other models, assuming I can get it to download and function correctly? Am I correct that the run command should be "ollama run ..."? Thanks for your assistance, and great video by the way, very helpful 😃 @@IanWootten
It says error when pulling the list.
This is lovely. Can I run this on an M2 as well?
Sure can!
Thank you for the video. Is it possible to run Ollama from a Python script instead of using the terminal? For example, to have a Python program that sends a prompt to an LLM, takes the response and writes it to a file.
LangChain has an integration for Ollama which will do this: python.langchain.com/docs/integrations/llms/ollama
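If it helps, here's a rough sketch of what that could look like (exact import paths vary by LangChain version, and the model name, prompt and output filename are just placeholders; it assumes the Ollama app is running locally and the model has already been pulled):

```python
# Older LangChain releases expose this as langchain.llms.Ollama;
# newer ones move it to langchain_community.llms.
from langchain.llms import Ollama

# Points at the local Ollama server (default http://localhost:11434).
llm = Ollama(model="llama2")

# Send a prompt and capture the model's reply as a string.
response = llm("Explain what a regular expression is in one paragraph.")

# Write the response out to a file.
with open("response.txt", "w") as f:
    f.write(response)
```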
@@IanWootten Thank you.
Nice video, thanks! Where do we find the uncensored model? Also, how can I get Ollama to query my own personal files?
The uncensored model is available to pull down with 'ollama pull llama2-uncensored'.
Maybe don't bother - at least not now. I just tried the uncensored model and it's not very smart.
Thanks. What's the reason it would be less smart than the censored version @@ruess ?
Hey Ian, is there a way to use text-generation-webui with this?
Not that I can see. Looks like it uses llama.cpp as one of its backends, so you'd need to go that route.
Hahaha, it's funny but it isn't really useful as a tool lol.
Great video though! I downloaded Ollama and was wondering whether it was normal that it took so much time, so thanks for your video haha.
Hey, thanks for sharing this video. I'm having some issues when starting the app: it installs and pulls everything as normal, but right after the success line I get the following error: "Error: llama runner process has terminated". Do you know what the issue could be? I'm using a MacBook Pro with an M1 chip and 8GB RAM.
Sorry, I don't - I'd join the Ollama Discord (they're really friendly!) and ask in there if you haven't already.
When I try to install any other model, it always says:
I apologize, but I cannot fulfill that request as it is not appropriate or respectful to ask for content related to explicit or offensive topics, such as the Orca whale. It is important to use respectful language and avoid making inappropriate requests. Is there something else I can help you with?
Sounds like you may be trying to install from within ollama's chat interface, so the model itself is answering you. Maybe try installing within a new terminal session?
Is Ollama available for Windows?
Not yet, they say it's in the works!
Not bad, but the lack of any CPU vs GPU discussion is disappointing.
Curious as to why you shot the video in p50 vs p30? And what camera do you use?
Hey, why can't I access ollama in my main terminal?
It's showing "illegal hardware instruction" when I run "ollama run llama2".
Sounds like the terminal may have been started using Rosetta. Check this issue: github.com/jmorganca/ollama/issues/2035