this open source project has a bright future
- Published Nov 1, 2023
- An introduction to Ollama.ai and some of my favorite integrations.
DevOnDuty video on gen.nvim: Local LLMs in Neovim: ...
Keyboard: Glove80 - www.moergo.com/collections/gl...
Stuff I use (Amazon affiliate links):
Camera: Canon EOS R5 amzn.to/3CCrxzl
Monitor: Dell U4914DW 49in amzn.to/3MJV1jx
SSD for Video Editing: VectoTech Rapid 8TB amzn.to/3hXz9TM
Microphone 1: Rode NT1-A amzn.to/3vWM4gL
Microphone 2: Sennheiser 416 amzn.to/3Fkti60
Microphone Interface: Focusrite Clarett+ 2Pre amzn.to/3J5dy7S
Tripod: JOBY GorillaPod 5K amzn.to/3JaPxMA
Mouse: Razer DeathAdder amzn.to/3J9fYCf
Computer: 2021 Macbook Pro amzn.to/3J7FXtW
Lens 1: Canon RF50mm F 1.2L USM amzn.to/3qeJrX6
Lens 2: Canon RF24mm F1.8 Macro IS STM Lens amzn.to/3UUs1bB
Caffeine: High Brew Cold Brew Coffee amzn.to/3hXyx0q
More Caffeine: Monster Energy Juice, Pipeline Punch amzn.to/3Czmfox
Building A Second Brain book: amzn.to/3cIShWf - Science & Technology
Jesus, I downloaded the codellama model and it's SOOOO QUICK. I tried using some of the older models and my M1 Pro was basically dying while calculating a response, but this model is even faster than ChatGPT!
Thanks so much for the video!
Finally! Exactly what I've wanted to start playing with locally.
All I can say is "Wow". I think you're spot on. This is the most important project of the year.
thanks! I knew I couldn't be the only one thinking this 😎
almost didn't click on this video, but really glad I did. more people should know about this utility, this makes using LLMs so much easier for the vast majority of people.
nice, really happy you got something out of the video!
I really enjoy the dry humour throughout all your videos. :)
Hey, thank you for featuring my plugin ♥, I'm a big fan of your channel!
Thanks for making it David! And also for the video - I believe it was how I discovered Ollama
Awesome video. Thanks for making it. I’m one of the maintainers of the project and it’s been a blast to make.
thank you and thank you for your work on Ollama! Truly an impressive project 😎
Someone else commented that the URL to the repo is not in the description @@codetothemoon
This is a game-changer tool - it makes it SO easy to quickly get started, and I'm already able to use it to generate some meaningful results! Excited to explore this more and try and see how to integrate it into my everyday life.
thank you for introducing me to this! this seems like the ultimate solution for running your own LLMs without having to run bulky webservers! really promising project, will definitely give it a go (pun intended)
That's amazing. Thank you for making this video! It's great to see some nice tooling around these AI models. 😃
thanks for watching! 😎
Can't believe how amazing it looks, gonna try it out
nice, hope it helps you in some way!
Excellent overview! Thanks so much.
thanks, glad you found it valuable!
I never thought about it but putting a LLM or any other very large ML model in something like a docker image makes so much sense
I’m playing around with it and it’s been an amazing experience. So well thought out.
nice! yeah I'm still amazed at how few folks know about it...
This looks really promising! I can just imagine how easy it would be to just "import" an LLM into my code lol
we can do it with APIs, but the catch is that they're (mostly) paid. Ollama seems promising 🙂
btw ❤ your channel
agree, having the details of how the language model is loaded / prompted handled by something else is really nice!
@@balloontune1769 yeah I think Ollama is an amazing alternative to paid APIs, I suspect the paid offerings may not last much longer unless they can sustain a substantial performance margin above the open source models
make short now
@@codetothemoon yeah i totally agree 🙌
Wow, thats looks great! Thanks for the update 😊
thanks for watching!
@@codetothemoon gave it a go. It's surprisingly powerful!
Oh man, that code review feature is amazing. I definitely want to try that.
agree - it seems like something that might be useful to most developers
The "It isn't written in rust, sorry about that" at 0:44 made my actually laugh out audibly, thank you for that LMAO
I'll definitely be checking this out. I'm working on a game engine and I rely heavily on Phind for help with it. I want to make my own AI assistant like Phind and incorporate it into the engine, so I and other users, if any, can use it without having to switch tabs or remind Phind of critical information.
Is Phind helpful compared to other coding assistants like Codeium or just ChatGPT? Is there any AI assistant out there that can fully integrate your large codebase into its context and evolve as more code/files are put into that codebase?
@@artoras6098 Hadn't heard of Codeium before, but yes. I would probably be at least a year behind where I am in making my game engine without it. Phind can search for the correct information, remember things you've talked about, and gives answers in a tutorial-style format. It's game-changing.
This has given me the idea to try running Neovim with one of these models to let me ask questions about open buffers on my new company's M3 Max.
No more evenings spent installing thousands of tools in different versions, manually trying to find download buttons on each website? Sounds like heaven!
right? 😎
Very cool, it's going to be interesting in the next couple of years with LLM tooling like Ollama. Are there any other Rust LLM projects you'd recommend that look promising (e.g. Burn)?
I didn't even know I could host an AI model on my local machine in the first place, I thought it would be too resource intensive
Yeah many of the higher performing models were a bit too large to be feasibly run on most personal computers, but in the last few months there have been significant advances in the performance of models small enough to run on most modern PCs. It’s an exciting time 😎
Training is cost intensive. If you have a trained model and are fine with not having the ability to correct mistakes (example: you show it a dog, it says cat, you can't "tell it" that it's a dog), then it's basically a bunch of multiplications.
Now it's doable locally, albeit with a 4090 for example, using Colossal-AI or something like tinygrad.
There are more to come with open sourced models.
Local LLMs don't really need big datasets. Fine-tuning a 7-billion-parameter model with 4-bit quantization is doable locally, like training on the weekends, 48 hours every week.
And the "AI field" now is training multi-agents: two or more LLMs talking to each other and trying to complete a task, like Microsoft's AutoGen.
With some tweaking I got the Vicuna image-generating AI working on my measly GTX 1650 Super. Though I never got a text-based model to work in GPU mode, and in CPU mode it's fairly slow.
IIRC LLaMA can be run locally, @geerlingguy runs it
“It isn’t written in Rust”. Time for a friendly rewrite
languages will be obsolete in 10 years as AI will write all code :D
@@marktellez3701 they also said the post office would be out of business due to the internet.
nice video! I have been playing with Open Interpreter and I enjoyed that it can interact with my local machine. Is the Ollama project capable of that? IMO a good local assistant should be active in the filesystem, I want it to grind away even when I'm not there.
Spent a good while playing around with it and it seems promising.
nice - yeah definitely worth keeping an eye on
Great Vid as always! Thanks for the heads up - gonna check it out :)
thanks! let us know how it goes!
This is an amazing tool
Agree 💯
am I just blind or did you forget the link to the repo in the description?
Love the channel BTW ❤
00:01 This open source project has a bright future.
01:16 Ollama, an open source project, offers multiple benefits for language model management.
02:34 Ollama exposes HTTP APIs for application integration
03:52 Obsidian's Ollama plugin allows users to extract information from a knowledge base.
05:18 Integrating Ollama with Neovim using gen.nvim
06:50 The Emacs Ollama integration allows you to use the contents of the current buffer as part of the context for your prompt to the language model.
08:19 The new model is more creative than the default.
09:46 Open source projects like Ollama have a bright future
Timeline generated by Ollama? 😉
this project is the MVP of the year, dangg...
agree 💯
I've not watched most of the video yet, but for some reason this project calls to me.
Answer the call! 🐺
Very informative!
Ollama is the easiest way to run 7B models locally on my laptop. I wish their Dockerfile-style Modelfile approach gains RAG and fine-tuning capabilities soon.
yeah funny you mention fine-tuning - I was hoping for this as well. it seems plausible to have some way of specifying some source of extra training data for fine tuning in the Modelfile - wondering if it's on their roadmap!
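For anyone who hasn't seen one: a minimal sketch of what a Modelfile can already express today. FROM, PARAMETER, and SYSTEM are real directives; the derived model name below is made up for illustration, and the hoped-for fine-tuning/extra-training-data directive doesn't exist yet:
```
# Modelfile: Docker-style recipe for a derived model
FROM mistral

# Higher temperature makes responses more creative
PARAMETER temperature 0.9

# System prompt baked into the derived model
SYSTEM """
You are a concise code reviewer. Point out bugs before style issues.
"""
```
You'd then build and run it with `ollama create code-reviewer -f Modelfile` and `ollama run code-reviewer`.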
What would be the easiest or best approach to create a model based on my Obsidian vault's Markdown files? I'd like it to be trained on top of my notes in addition to the base llama2 model. Any suggestions would be much appreciated. Awesome video, btw.
Just Google for "train ollama model on your own data" - lots of useful stuff right on page one.
One of the easiest in terms of installation and execution. No GUI, no problemo. Try this prompt: "what is the square root of two?" using the Mistral model. Amazing!!
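For reference, that's a one-liner once Ollama is installed; passing the prompt as an argument skips the interactive chat, and the model is downloaded automatically on first run:
```
# Non-interactive: pass the prompt directly to the model
ollama run mistral "what is the square root of two?"
```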
Great video ❤
Glad you liked it!!
Within working memory, I have never wanted to thumbs up a video more.
Wow! Very cool :O
Thx
thanks for watching! 😎
Fuck yeah dude, I'm hyped. I knew I wasn't contributing to a random script kiddie project; I've been telling everyone I know about how cool Ollama is. The OG devs are actually incredible, so open to feedback, and the project is super friendly for first contributions.
Wow, 👍 and thanks for your contribution..
9:31 gotta love how he accidentally prompted the Angry Rustacean model with "ollama list" and it gave a fitting response
*ahem* I really would like to see LoRA adapters or fine-tuning features in some of these great abstractions, especially for code assistant features that learn local codebases. I'm not sure if vector databases could compete, but fine-tuning plus some kind of Git-like tracker to diff fine-tuned downstream models would be interesting, especially with LoRA.
that's wonderful. AI without Big Tech (and therefore privacy)
this is so cool
I thought so too 😎
Nice idea 💡👍
Agree 💯
Man got some balls. He's using VIM!!!
💪
Damn, that is super impressive.
Edit: that response as an arrogant Rust developer floored me 😂
This is fire
agree! 🔥
As soon as you said it's like Docker-izing I was like "oh that makes sense"
makes total sense! got me thinking about what else might benefit from the same pattern...
Can we use it in a production environment for processing inference in parallel?
Good question - I didn't look into parallelization specifically but I'd be shocked if they don't have a story there, either in the current release or on the roadmap
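For a concrete picture of the integration surface: Ollama serves an HTTP API on localhost port 11434 by default, so parallel inference from the client side is just concurrent requests (how the server queues or batches them is the part worth verifying for production). A minimal sketch, assuming a pulled mistral model:
```
# Single-shot generation against the local API (default port 11434);
# "stream": false returns one JSON object instead of a token stream
curl http://localhost:11434/api/generate -d '{
  "model": "mistral",
  "prompt": "Why is the sky blue?",
  "stream": false
}'
```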
In regards to cancelling ChatGPT, what models are as good as GPT-4 right now? I'm not very up to date on all the ones coming out.
I haven't done any rigorous testing, but in my short experience Mistral 7B seems to give responses of a quality quite comparable to GPT-4. I believe GPT-4 is still the leader in all of the formal benchmarks, but the open source LLMs are getting really, really close, and it's not hard to imagine they will surpass it in 1-2 years
@@codetothemoon and I can just go get models like that on Hugging Face pretty easily?
@@notgate2624 you can but it's even easier with Ollama - just do `ollama run mistral` and it will download the model automatically and drop you into a command line based chat with the model
@@codetothemoon sweeeet
I can't believe it's not written in Rust, I'm literally shaking and crying rn 😔
Right ? 😂
yes this was a heartbreaking discovery 💔
Heard! 🔥
Waiting for the native Windows version. I'm running the WSL2 version locally (and the macOS version on a MacBook Pro M1; it's amazingly fast there) but it's a hassle to expose the port to other devices on my network. I'm also using a VS Code plugin called Continue that does nearly the same as the Vim & Emacs plugins mentioned. Very nice video, I didn't know about Obsidian.
Wow. This honestly feels like sci-fi. It was only yesterday we were told that LLMs were too heavy to run on standard PCs, so this is a massive leap.
I agree, it’s been crazy how quickly this stuff has been evolving
you have no idea, I've run Ollama on my Raspberry Pi 4B 4GB... It's definitely very slow, but IT IS POSSIBLE... I'm excited to see what the future holds for us, buddies!
Very promising. Thanks for sharing. Expect them to get $10M series A at $100M valuation. 💵🙄
The llama always reminds me of Perl, because the O'Reilly cover was a llama
Wow, this looks promising! Would it be possible to feed llama the code of, let's say, a legacy codebase and use it to answer questions about it for newcomers who need onboarding? Am I too much of a dreamer?
It would be very nice. The old team/vendor left behind messy code with outdated documents.
Have the model study it and explain it to you.
I think this could be achieved by fine-tuning the model on said codebase using LoRA for example. Without retraining, I don’t know how it would be done
Looks promising
agree 💯
_very soon_
_project has 10k stars_
Do we have a contest that elects the most important open source project of the year?
What keyboard are you using? Otherwise great video!
It looks like a MoErgo Glove 80
There's a link in the video description for them as well
thanks! Keyboard is a Glove80, I actually made a whole video about it th-cam.com/video/PFFa3h7eLWM/w-d-xo.htmlsi=AoAujHfbazkDtanZ
I love the idea of Ollama in my editor. I cannot find a vim version of that plugin though. Anyone know of any?
It's fun 🎉🎉
Didn’t Nomic AI do this a while ago with GPT4All?
I thought LLMs were totally inaccessible to newbies like me, but now I see I was wrong
could I train it to know my daily tasks and do some of them when I'm offline or inactive on my PC?
cuz that's a game changer
I really want to try this, but here's the issue: when running in WSL it doesn't pick up the GPU no matter what you do. Follow the instructions to use the CUDA drivers, nothing; it just defaults to the CPU and doesn't even attempt to choose the GPU. And it doesn't have a Windows version yet, it just says coming soon. So it's virtually impossible to use on Windows, at least in my experience.
How can I provide my own corpus when building a model?
This thumbnail is really nice, how did you make it?
Thanks! I use Adobe Illustrator to make all of my thumbnails, though I’ve been looking into switching to something else - I think there is an open source equivalent called Ink-something. Also I heard Fireship just uses Figma…
What IDE were you running? That looks like VS Code but, like… sexy?
Move over! 👀👀
What is that keyboard??
Glove80 - I have a whole video on it! th-cam.com/video/PFFa3h7eLWM/w-d-xo.html
This is working so great! Is there any way to get it to use my 3090 GPU though?
Finally it pays off to have 32GB of RAM xD
this isn't new, I've been messing around with local LLMs for a few weeks and Ollama was one of the first things I saw as a way of working with LLMs. Am I really confusing it with something else? Didn't it come out with the LLaMA LLM? or around the same time?
this is the one that still does not work on Windows, never mind what I said, this is not what I've been using lol sorry
How safe is it to download these LLMs from their site? Please let me know.
I don't think there is any danger in downloading the LLMs themselves. As with any time you are downloading a binary from the internet, your caution should be focused on the Ollama binary itself - but LLMs aren't actually executable code, so there isn't a possibility of malware there.
@@codetothemoon Thank you for the response. I am hesitant about their app as well; even when using Little Snitch, I am hesitant to start an installation process that requires entering your admin password. Are there any similar open source apps out there? I was also wondering what LLM you would recommend for an M2 Max MacBook with 64GB RAM? Is it possible to run 70B models on it? Thanks.
so cool, almost like Google Duet for free
Why aren't you using a terminal in Emacs? You could've done your nvim demo in there instead.
LOL yes this was a miss!
cool
😎
does this work off VRAM or RAM? also, can it code?
GPU is not required - it can do inference on the CPU so in that case you'd be constrained by just RAM. And yes it can code, though I haven't found the smaller models to be very good at it. Maybe the larger ones are better.
@@codetothemoon excellent, any you can recommend?
It's great to have more open/local options, but I'm quite over LLMs. I always thought the only compelling applications were funny/ridiculous NPC dialogues in games/mods, or making it even harder to get a hold of a human for customer-support-type things... ("AI" bureaucrats truly are the worst dystopia)
Having your internal docs available through a chatbot, instead of having to churn through tons of disparate documents/wikis/readmes, is going to be a game changer tbh.
I tried Ollama on a 2020 MacBook Pro M1; it runs very slowly.
Can I run it on my ThinkPad X260 tho?
not sure of the specs of that machine, but there is probably a language model that will work. the 7B ones are great and will probably work on most modern machines, but if you need to go smaller maybe check out orca-mini
So imagine you create an app and you want to integrate Ollama into that app. Can you host the model you created?
Guess I need to get a beefy computer to run this locally 😅
not necessarily! some of the more modestly sized models like Mistral 7B are actually relatively performant even without a GPU!
Wait… yours drops into a prompt if you do `ollama run` without a prompt, whereas mine exits immediately.
Snoop Dogg: just roll with it, life's too short.
Socrates ::drinks hemlock:: I know
could this be integrated with helix?
there's nothing precluding that on the Ollama side, but last I checked Helix didn't have plugin support, so if that's still the case it might be the biggest obstacle...
@@codetothemoon Yes, helix still has no plugin support yet. But currently I’m trying to get a solution that works with pipes and the ‘:pipe’ command
@@codetothemoon Got it kinda working with the helix :pipe command, but still not optimal
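For anyone else trying this: a rough sketch of the :pipe approach, assuming ollama run reads the prompt from stdin when input isn't a terminal (which it does in recent builds):
```
# In Helix: make a selection, then run one of
#   :pipe ollama run mistral      (replaces the selection with the output)
#   :pipe-to ollama run mistral   (sends the selection, discards the output)
```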
this is the only way I've found to run LLMs without my GPU getting really hot.
hahah nice - keep that GPU cool 😎
Why not link the github repo in the description?
I should have! github.com/jmorganca/ollama
where are you looking ?
…even more useful things. You can't compare two fundamentally different things.
What's the MetaMath model? Does it generate code for the Metamath theorem prover? If so, that's super cool
to be honest I didn't look into it, I was just downloading random models off of HuggingFace to try out the GGUF import feature 🙃
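For anyone curious, the GGUF import flow is pleasantly short. A sketch, with a hypothetical file name:
```
# 1. Point a Modelfile at a local GGUF file (file name is hypothetical)
echo 'FROM ./metamath-7b.Q4_K_M.gguf' > Modelfile
# 2. Register it with Ollama under a local name
ollama create metamath -f Modelfile
# 3. Run it like any other model
ollama run metamath
```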
"It isn't written in Rust". Let's fix that!
*cracks knuckles
that was my first reaction when I looked at the GitHub repo as well.... 🙃
That's very good, except that to run these things you need a lot of RAM. For the most basic models you have to have a remote server with 8GB of RAM, which will be very expensive, or you have to run it on your own computer with the same amount of RAM.
Microsoft is doing some nice work on this. For example, they created MiniLM on Hugging Face, which only uses a couple of hundred MB. It doesn't have the 7 billion parameters that other common models do, but it is supposed to perform decently, and it's intended for small systems
It's funny that I just assumed it was written in rust 🦀
yes, was hoping to be able to add a project of this stature to the list of great Rust projects but no such luck 😭
but Oobabooga is something similar
Your prompter is too high above the camera and it constantly looks like you're talking to me while looking over my head. I'm scared there's someone behind me 😟
Finally I can have Jarvis
That's Chris, right?
who is Chris?
1:01
32TB/s? Woww
hah yeah didn't even notice this while making the video, must be due to stuff being cached
@@codetothemoon oh I didn't think of that.. but still 😂👌
How private is this?