Ollama: The Easiest Way to Run Uncensored Llama 2 on a Mac

  • Published 16 Nov 2024

Comments • 59

  • @agntdrake
    @agntdrake 1 year ago +3

    I love the video! I'm not sure why it's returning the system prompt in some of your queries though. I pulled the same layers and it seems to be working fine. Also, the llama2:13b model gives a lot better results to some of the questions you were asking, and will give you a result for the regex (along with the obligatory "don't do bad stuff" message).

  • @IanWootten
    @IanWootten 1 year ago +3

    Looks like you can now pull the 70B model too with "ollama pull llama2:70b". A paltry 39GB download.
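
    A rough sketch of that flow, using the tag above (a hedged note: the memory figure is an assumption, since the 70B model needs far more RAM than the 7B/13B variants, likely a 64GB-class machine):

      ollama pull llama2:70b   # ~39GB download, per the comment above
      ollama run llama2:70b    # assumption: needs much more RAM than 7B/13B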

  • @АртемФедоров-ю7б
    @АртемФедоров-ю7б 9 months ago +6

    Regarding the killers question, the AI was right. There are 3 killers: 2 alive and 1 dead.

    • @nickcarter3257
      @nickcarter3257 9 months ago +2

      Agree. The query did not specify alive or dead killers.

    • @YKSGuy
      @YKSGuy months ago

      Came here to the comments to see if someone had already said this.

  • @satkinsonfreshobject
    @satkinsonfreshobject 1 year ago

    Would be very cool if Ollama could implement Agents with this kind of simplicity. Maybe just connect using AppleScript or Apple Workflows or even just shell scripts. That would rock.

  • @GavinLyonsCreates
    @GavinLyonsCreates 1 year ago

    That was an interesting rabbit hole to venture down. I ended up messing around with Continue to improve my Arduino code in VS Code in the end (with the GPT model). It does the basics pretty well. However, the Llama uncensored models are really interesting, along with the models on Hugging Face. Thanks for the video.

  • @dlprod11
    @dlprod11 months ago

    Ollama on my M1 just refused to run. How do I get around this??

  • @ftlbaby
    @ftlbaby 6 months ago

    Thank you for this! The two main things that I dislike about LLMs are the middle-school-level answers and the nanny rails. Hopefully, running an uncensored LLM will at least make the low intelligence level less grating.

  • @LokoForPopoCuffs
    @LokoForPopoCuffs 1 year ago

    Amazing video, Ollama is just what I was looking for. Glad I found this! Excellent content.

  • @robdoubleyou4918
    @robdoubleyou4918 11 months ago

    Thanks for the video. Works like a charm on my MacBook Pro M1. You rock!

    • @boyinclass
      @boyinclass 8 months ago

      May I ask, are you using an M1 or an M1 Pro, and what memory size?
      What speed does it run at (i.e. how long does it take to respond)?

    • @robdoubleyou4918
      @robdoubleyou4918 8 months ago

      @@boyinclass I've got the first generation of MacBook Pro M1, 16GB memory. It's very quick, but the language model is no GPT (i.e. the answers aren't as sophisticated).

  • @jzam5426
    @jzam5426 6 months ago

    Do you know how to get it to run in LangChain while taking advantage of the M1/2 chips?

  • @shuntera
    @shuntera 10 months ago +1

    Might be a good idea to revisit this now that there are specific models out there like codellama, tuned for writing code.

    • @IanWootten
      @IanWootten 10 months ago

      I have a video on using codellama here: th-cam.com/video/TWc7w5rvKrU/w-d-xo.html

  • @노르웨이연어고양이
    @노르웨이연어고양이 1 year ago

    Can I make Meta's original model (the full-size original) available in Ollama? I've downloaded the original model file, but no matter how hard I look, I can't find the path where the model is stored.
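
    A heavily hedged sketch of one way to do this, assuming the Meta weights have first been converted to a single GGUF file (for example with llama.cpp's conversion scripts; the filename below is hypothetical):

      # Point a Modelfile at the converted weights (hypothetical filename)
      echo "FROM ./llama-2-7b.Q4_0.gguf" > Modelfile
      # Register and run it under a name of your choosing
      ollama create my-llama2 -f Modelfile
      ollama run my-llama2

    For the storage question: Ollama keeps its own pulled models under ~/.ollama/models, which may be the path being looked for.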

  • @greenageguy
    @greenageguy 1 year ago +1

    Shouldn't we be more specific with the prompt? E.g. "How many living killers are now in the room?"

    • @IanWootten
      @IanWootten 1 year ago

      Great point. I think this would be a given for a human reading the question.

  • @mehmetbakideniz
    @mehmetbakideniz 6 months ago

    Does it automatically detect and use the Apple M2 GPU? Is there anything I need to configure to use it with the GPU?

    • @IanWootten
      @IanWootten 6 months ago

      Nope, it should automatically make use of Apple silicon.

  • @exploratoria
    @exploratoria 5 months ago

    Hi Ian, great clip - how do we get it to read the prompt answers aloud with reasonably low latency?

    • @IanWootten
      @IanWootten 5 months ago

      You'd need to pipe the chat output into a text-to-speech (TTS) model. macOS has the built-in "say" command, so you could send it straight into that if you want to keep it all local, but it won't be anywhere near as good as an external service.
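
      A minimal local sketch of that pipeline (the prompt is just an example; note that say only starts speaking once the pipe closes, so latency is the full generation time unless you chunk the output yourself):

        ollama run llama2 "Tell me a short story" | say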

  • @masmadp9612
    @masmadp9612 1 year ago +1

    Great video! I found it helpful :)

  • @josephkaisner4581
    @josephkaisner4581 6 months ago

    Very helpful, thanks!

  • @NitroBrewbell
    @NitroBrewbell 1 year ago

    Great Ollama intro! I'm curious why Ollama has to run on macOS directly and not in a Docker container as a Linux OS - is it a technical limitation? I presume the model eventually has to be deployed somewhere in the cloud, in a container, for production.

    • @IanWootten
      @IanWootten 1 year ago +1

      Not a technical limitation. At the point this video was made there wasn't an official Linux or container distro. The good thing is that both of those have recently happened, so you can run it on Linux or under Docker if you wish. The only reason for me running on a Mac is simplicity.
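
      For reference, a hedged sketch of the Docker route, following the commands in the project's README (image name as published on Docker Hub):

        docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
        docker exec -it ollama ollama run llama2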

  • @ggopi767
    @ggopi767 1 year ago

    Can we run open-source LLMs like Llama, or Stable Diffusion, on a Mac with an AMD Radeon R9 M380 2GB GPU and a 3.2 GHz quad-core Intel Core i5 processor?

    • @IanWootten
      @IanWootten 1 year ago +1

      DiffusionBee is a great app for running Stable Diffusion simply on a Mac. Take a look at the video I made on it.

    • @ggopi767
      @ggopi767 1 year ago

      @@IanWootten Sure. Can you also please share your views on language models - will this hardware be enough to run an LLM?

  • @jaredmartin8760
    @jaredmartin8760 1 year ago

    Do I have to re-download the model each time I reopen the terminal?! I'm using llama2-uncensored. I got it working, then closed out of the terminal. When I reopened the terminal and asked to run llama2-uncensored, it began downloading the 3.8GB file again. Is this normal? I don't want to stack up gigs of data because I'm running the program wrong.

    • @IanWootten
      @IanWootten 1 year ago

      Are you sure you ran *exactly* the same run command? Models that are already downloaded should just run locally when you call 'ollama run'.
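
      A quick way to check, as a sketch (the name passed to run must match the listed name exactly):

        ollama list                    # models already downloaded locally
        ollama run llama2-uncensored   # re-uses the local copy if the name matches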

    • @jaredmartin8760
      @jaredmartin8760 1 year ago

      @@IanWootten To my knowledge, yes - I referenced your video to be sure. Additionally, after the second download it would not load up; it kept giving me warnings regarding inappropriate content etc. Very strange - that's the whole purpose of the uncensored file. I'm an author, but I write erotica on the side, and the prose for erotica is pretty forgiving - perfect for AI assistance - but the options for that are pretty limited, so I was hoping this would help. Even before, when the downloads seemed to be working fine, it appeared uncomfortable with erotica. Do you have any recommendations for other models, assuming I can get it to download and function correctly? Am I correct that the run command should be: ollama run "..."? Thanks for your assistance, and great video by the way - very helpful 😃

  • @northerncaliking1772
    @northerncaliking1772 5 months ago

    It says "error" when pulling the list.

  • @sanjayarupasinghe9673
    @sanjayarupasinghe9673 11 months ago

    This is lovely. Can I run this on an M2 as well?

    • @IanWootten
      @IanWootten 11 months ago

      Sure can!

  • @jff711
    @jff711 1 year ago

    Thank you for the video. Is it possible to run Ollama from a Python script instead of using the terminal? For example, a Python program that sends a prompt to an LLM, takes the response, and writes it to a file.

    • @IanWootten
      @IanWootten 1 year ago +2

      Langchain has an integration for Ollama which will do this. python.langchain.com/docs/integrations/llms/ollama
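
      Under the hood that integration talks to Ollama's local REST API, which a script can also call directly without LangChain - a hedged sketch using curl, with the default port from the docs:

        # Streams JSON lines; each object carries a "response" fragment
        curl http://localhost:11434/api/generate \
          -d '{"model": "llama2", "prompt": "Why is the sky blue?"}'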

    • @jff711
      @jff711 1 year ago

      @@IanWootten Thank you.

  • @zeropain9319
    @zeropain9319 1 year ago

    Nice video, thanks! Where do we find the uncensored model? Also, how can I get Ollama to query my own personal files?

    • @IanWootten
      @IanWootten 1 year ago +2

      The uncensored model is available to pull down using llama2-uncensored.
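
      Concretely, a minimal sketch of the pull and run:

        ollama pull llama2-uncensored
        ollama run llama2-uncensored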

    • @ruess
      @ruess 1 year ago

      Maybe don't bother - at least not now. I just tried the uncensored model and it's not very smart.

    • @zeropain9319
      @zeropain9319 1 year ago

      @@ruess Thanks. What's the reason it would be less smart than the censored version?

  • @njarecki
    @njarecki 1 year ago

    Hey Ian, is there a way to use text-generation-webui with this?

    • @IanWootten
      @IanWootten 1 year ago

      Not that I can see. Looks like it uses llama.cpp as one of its backends, so you'd need to go that route.

  • @sitrakaforler8696
    @sitrakaforler8696 1 year ago

    Hahaha, it's funny but isn't really useful as a tool lol.
    Great video though! I downloaded Ollama and was wondering whether it was normal that it took so much time, so thanks for your video hahah

  • @pravinsukumaran6080
    @pravinsukumaran6080 11 months ago

    Hey, thanks for sharing this video. I'm having some issues when starting the app: it installs and pulls everything as normal, but right after the success line I get the following error: "Error: llama runner process has terminated". Do you know what the issue could be? I'm using a MacBook Pro with an M1 chip and 8GB RAM.

    • @IanWootten
      @IanWootten 11 months ago

      Sorry, I don't - I'd join the Ollama Discord (they're really friendly!) and ask in there if you haven't already.

  • @mariorobles87
    @mariorobles87 1 year ago

    When I try to install any other model, it always says:
    I apologize, but I cannot fulfill that request as it is not appropriate or respectful to ask for content related to explicit or offensive topics, such as the Orca whale. It is important to use respectful language and avoid making inappropriate requests. Is there something else I can help you with?

    • @IanWootten
      @IanWootten 1 year ago +1

      Sounds like you may be trying to install from within Ollama's interface. Maybe try installing from a new terminal session?

  • @Sushmaya1838
    @Sushmaya1838 1 year ago

    Is Ollama available for Windows?

    • @IanWootten
      @IanWootten 1 year ago +1

      Not yet, they say it's in the works!

  • @Dave-nz5jf
    @Dave-nz5jf 1 year ago

    Not bad, but the lack of any CPU vs GPU discussion is disappointing.

  • @macwebcomputing
    @macwebcomputing 1 year ago

    Hey, great video! Have you heard about our services?

  • @AI-Consultant
    @AI-Consultant 1 year ago

    Curious why you shot the video in p50 vs p30? And what camera do you use?

  • @amanreddypundru8933
    @amanreddypundru8933 10 months ago

    Hey, why can't I access Ollama in my main terminal?
    It's showing "illegal hardware instruction" when I run: ollama run llama2

    • @IanWootten
      @IanWootten 10 months ago +1

      Sounds like the terminal may have been started using Rosetta. Check this issue: github.com/jmorganca/ollama/issues/2035
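
      One quick check from the affected terminal session (a sketch; arm64 is what a native shell on Apple silicon should report):

        uname -m   # arm64 = native; x86_64 = the session is running under Rosetta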