Set up a Local AI like ChatGPT on your own machine!

  • Published Oct 1, 2024
  • Dave explains the reasons why and the steps needed to set up your own local AI engine à la ChatGPT. For my book on the autism spectrum, check out: amzn.to/3zBinWM
    Helpful links and tips:
    Install ollama:
    curl -fsSL ollama.com/ins... | sh
    ollama serve
    WSL Kernel Update Package: learn.microsof...
    Run the UI with Docker:
    docker run -d -p 3000:8080 --gpus=all -v ollama:/root/.ollama -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:ollama
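Once `ollama serve` is up, a quick way to sanity-check the local endpoint before layering the web UI on top is to hit Ollama's REST API directly (it listens on port 11434 by default). A minimal sketch; the model tag `llama3` is just an example and assumes you have already pulled it with `ollama pull`:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint


def build_payload(model: str, prompt: str) -> bytes:
    """Build a non-streaming generate request body for the Ollama REST API."""
    return json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()


def ask(model: str, prompt: str) -> str:
    """Send a prompt to the local Ollama server and return the response text."""
    req = urllib.request.Request(
        OLLAMA_URL,
        data=build_payload(model, prompt),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]


if __name__ == "__main__":
    # Requires a running `ollama serve` and a pulled model.
    print(ask("llama3", "Say hello in one short sentence."))
```

If this call answers, the Open WebUI container should be able to reach the same backend.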

Comments • 740

  • @NachosElectric · 3 days ago +13

    Dell Price: $35,594.36
    Shipping: Free
    Thank God the shipping is free.

  • @AFNacapella · 8 days ago +279

    "open the garage doors, HAl."
    "I'm afraid I can't do that, Dave."

    • @lonewitness · 8 days ago +4

      Open-source AI models are more like Jarvis than HAL.

    • @javabeanz8549 · 8 days ago

      @@lonewitness just watch out for The Riddler

    • @dighawaii1 · 8 days ago +9

      Dave? Why are you doing this, Dave?

    • @ApeStimplair-et9yk · 7 days ago

      @@javabeanz8549 nope it is the candyman for special Art-E-Fish'ale Philantrophy's.
      did anybody seen karl marx at the sicknuts from disney ?

    • @Roberto-SergeiIVVonYamashita · 7 days ago +3

      Lots of cleverness in these 2 lines... Well played Mr. fake W

  • @TheBugkillah · 7 days ago +38

    I’m cool with living vicariously through Dave.

  • @kenniejp23 · 5 days ago +6

    Installing Docker via snap caused issues where it couldn't access my Nvidia graphics card to run the AI.
    I believe this is because I'm running an "unsupported" Linux version.
    Installing Docker via apt fixed this.

  • @davidgreen9834 · 7 days ago +29

    So, for everyone who is struggling to get WSL 2 set as your default, you need the command "wsl --set-default-version 2" in your PowerShell. Spent an hour figuring it out and know this will save a few headaches. Thanks for the video, Dave. I hope to get this operational before bed.

    • @JustinEmlay · 4 days ago

      New? It's been out for over 4 years now ;p

    • @davidgreen9834 · 4 days ago

      @@JustinEmlay Thanks for pointing out the spelling error, I typed that late into the night.

    • @AIG-Development · a day ago

      How do we link the Ollama AI to the web UI? You seemed to skip that part?

  • @DataIsBeautifulOfficial · 8 days ago +136

    So, we’re just casually summoning AIs at home now?

    • @KimForsberg · 8 days ago +17

      Been doing that for a while. Not too hard. The largest problem is having a good enough model that runs on a reasonably priced home computer.

    • @TheRex42 · 8 days ago +3

      lol I love this in this context

    • @phils744 · 8 days ago +2

      That's amusing, "casually summoning AI models", like requesting your own personal butler to remove the plates from the table once you are done eating. Or like using Uber: where is my ride, "Alexa"? I called for it 30 minutes ago 😊

    • @GungaLaGunga · 8 days ago +7

      Daemon, not a demon.
      "It's the work of the devil" - Mama Boucher
      No it's not. It's just zeros and ones. Ons and offs.
      Same voltages and vibrations, vibes and grooves as the rest of the universe. The oneness is us. Oh eye see.

    • @GrayeWilliams · 8 days ago +9

      Summon sounds too archaic. Wait, not archaic enough.
      We invoke them.

  • @researchandbuild1751 · 4 days ago +8

    Why use WSL? Ollama has a Windows installer.

    • @charleswaters455 · 8 hours ago

      I've been using LMStudio which also runs natively in Windows.

  • @eugene3d875 · 6 days ago +24

    And just like that, 13 minutes led to an evening of successful tinkering. Thanks for the inspiration!

    • @heyheyhophop · 4 days ago +2

      Glad to see some of us have 50K worth of hardware at hand 😅

    • @eugene3d875 · 4 days ago +2

      @@heyheyhophop lol, just use your gaming machine, it still works quite fast. I certainly don't have the same beast of a machine.

    • @heyheyhophop · 4 days ago

      @@eugene3d875 Right, was kidding; I hope my 12GB 3060 and 48GB of plain RAM will let me go relatively far - esp. as the layers can now be partially offloaded to the CPU, as far as I understand

    • @EugeneShamshurin · 4 days ago +2

      @@heyheyhophop I found that 48 GB would be sufficient. I'm running inference on CPU only, due to an incompatible graphics card, and it still performs quite well, while keeping the total RAM load under 32 GB. So I think your setup will do great

    • @heyheyhophop · 4 days ago

      @@EugeneShamshurin Many thanks for letting me know

  • @gertleusink5125 · 8 days ago +46

    The PowerShell command is "wsl --install"

    • @toploaded2078 · 8 days ago +4

      Thanks!

    • @Mopharli · 8 days ago +1

      Thank you. I found this after it didn't work, and it still didn't, but then I tried "wsl --update", which started the install.

    • @Mopharli · 8 days ago

      ... as I notice it says on Dave's very next slide!

    • @OldPoi77 · 8 days ago

      comments coming to the rescue ;)

    • @Mopharli · 8 days ago +2

      ...and if you're having issues running that large Docker command copied from the video description, it has "sudo" missing from the start of it... and make sure you run it from the initial wsl command line rather than any other you may have opened.

  • @markkenefick644 · 7 days ago +16

    Dave, as a retired Deccie, I love your appreciation of the PDP-11. Worked on many PDP-11/34s way back when. Oh, and love the shirt.

  • @tsdbhg · 8 days ago +9

    Thank you. This helped me create my current girlfriend.

  • @Vilvaran · 7 days ago +9

    I had a feeling this was the Ollama model. I can verify as a Linux user that the install is as simple as installing ollama from the package manager / Flathub, then running two commands: 'ollama serve', then 'ollama run', which automatically fetches the model if it is not already there...
    Two *very useful* commands within the chat interface are /load and /save. You can keep your AI 'alive' and contextually relevant by saving it before exiting.
    5 minutes is my average prompt time, if anyone asks...

  • @JasonKingKong · 8 days ago +38

    Sweet PDP11 shirt.

    • @Dirtyharry70585 · 8 days ago +2

      Just thinking about the comparative size of a DEC PDP-11 versus that Threadripper unit.

  • @TroySkirchak · 7 days ago +15

    Please make more videos about this subject.

  • @tubeDude48 · 8 days ago +18

    For those that don't know: if you installed Debian, replace the 'snap' command with the 'apt' command. BTW, Debian does all of this quite well too.

    • @JimmyS2 · 6 days ago

      I believe you can use both apt and snap on Ubuntu, since it's Debian-based and snap is developed by Canonical.

    • @tubeDude48 · 6 days ago

      @@JimmyS2 - I use Mint, so snap isn't used.

    • @christiandior8726 · 6 days ago

      I LOVE YOU! Stay awesome!

    • @christiandior8726 · 6 days ago +2

      Snap didn't work but apt did! Now getting the following error after trying the web-ui command:
      docker: Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?
      Asked llama and ChatGPT for answers, but they recommend systemctl commands that do not work on WSL 2 Ubuntu (I think). Is there any solution? (Really grateful for your time!)

  • @MrNilOrange · 8 days ago +11

    This is brilliant Dave. But you are underestimating your own expertise and some of the configurations you already have in place on your machine - or simply do automatically. So lots of failures and error messages. But if you are an ageing weird computer nerd like me it's fun sorting it out :-) Thanks.

  • @Konrad-z9w · 8 days ago +10

    Sitting on an airplane with a laptop, or sitting in your shop next to a Threadripper: exactly the same noise level.

  • @solarisone1082 · 8 days ago +14

    Your computer’s specs make me want to cry.

    • @robbybobbyhobbies · 8 days ago +1

      Just wait a couple of years and it’ll be commonplace. Of course you’ll still be upset by his future setup, but your computer will be this powerful.

    • @alok.01 · a day ago

      @@robbybobbyhobbies Makes me want to wait a few years so AI-capable tech can mature and become cheap

    • @SweNay · a day ago +1

      @@alok.01 I saw some news somewhere that the cost depreciation in AI development was like 97%, which made it the fastest-dropping market in history, so we're getting there 😅

  • @osterbybruk · 8 days ago +102

    Just casually throwing out a 13min video that can completely transform your life and business... that's so Dave.

    • @FlintStone-c3s · 8 days ago +3

      Well, he is on the spectrum so he does this all the time, no big deal, ha ha. My family has no idea why I get so happy using Ollama on my Pi5.

    • @BastetFurry · 7 days ago +1

      @@FlintStone-c3s Ollama on a Pi5? You are either a very brave or a very patient person.

    • @craigknights · 7 days ago +1

      I need to do some reading, but what are people actually using it for? I can't think of what I might ask it to do.

    • @KimYoungUn69 · 2 days ago

      @@BastetFurry Says nothing; it's all about the model

    • @KimYoungUn69 · 2 days ago

      @@craigknights You can ask it what to ask

  • @jaz093 · 7 days ago +8

    Yes. So glad you're covering this topic. Being able to use the files on your own PC without having to upload them to other companies' servers.

  • @OceanusHelios · 8 days ago +13

    That was excellent and I was able to get it up and running just like your instructions concisely provided. After trying it for several hours, I can say that it isn't a bad language model at all.

    • @markae0 · 7 days ago

      Can you remove the adult content limitation?

  • @andrewperkins2083 · 8 days ago +10

    Dave, would you be willing to do a follow up video outlining a few lower levels of hardware? You don't even have to run the model on them (although that would be awesome), but describe some machines say in the 1k, 5k, 10k, and 25k range?

    • @benjaminlynch9958 · 7 days ago +3

      I’ve been running this exact setup in Linux (Pop!_OS) for a couple weeks now, and it works fine on any modern (e.g. less than 6 years old) hardware. Initially I tried it with just my CPU (Ryzen 5800X), and it was fine, albeit a little slow. But definitely usable. After I enabled GPU acceleration on my Nvidia 2070 Super, the responses came back stupid fast. Like 10x faster than I could read them.
      The only thing I would note is that either the CPU or GPU (whichever one is enabled) is going to be pinned at 100% utilization while responses are being generated. The practical effect is nontrivial power draw, and for laptops much shorter battery life unless the unit is plugged into the wall. But don’t let that put you off. Even modest hardware (a $1,000 PC brand new 5 years ago) is more than sufficient. Just be aware of your battery level if you’re going to do this on a laptop.

    • @kuromiLayfe · 6 days ago +1

      Ollama recommends a 10th-gen i5 CPU or AMD equivalent and an Nvidia 20xx-series GPU with 8GB or more VRAM.
      Make sure your Windows drive or host drive has enough disk space, as the models can easily rack up 100-400 GB out of nowhere.

  • @realmstupid-on8df · 6 days ago +5

    I spent 5 hours trying to Google this...to find no real answer. Then this. Thanks.

  • @loop8946 · 8 days ago +10

    Not sure how much work it would have taken, but with everything running through Docker it's likely easy to test on a lower-end machine. While I think it's really cool to see performance on a machine I may have something comparable to in 15-20 years... it would also have been informative to see the performance on anything close to normal consumer hardware as a comparison.

    • @The_Craphound · 5 days ago +1

      I'm running it on (don't laugh... it's paid for!) a Dell Optiplex 790 with 32GB RAM (yes, you CAN install that much) and a poor ol' GTX 1070 Ti video card, and it works like gangbusters! The graphics card is modestly overclocked. Turned off the overclocking and I'm not sure if it ran with just a slight performance hit, or maybe that was just my imagination. Point is, although there's no doubt the Threadripper would leave my system PAINFULLY smoked in the dust on more intensive work, the basic chat works great as is. For something that's isolated from web learning, I was very surprised at its breadth of knowledge (it even knew what WSL is...). A roughly 8+ year-old machine will run this configuration just fine...!

    • @SixplusonemediaAu · 5 days ago +1

      Yep, can confirm that it runs really well on my older AMD Ryzen 3600 with 32GB RAM and a 2070 GPU.

  • @c0d3warrior · 5 days ago +2

    And just like that I've got an AI running locally on my machine. Feels kind of weird tbh. Awesome video guide, thanks a lot!

  • @chrisparker5712 · 6 days ago +2

    You can't load the open-webui container with the snap version of docker. Kinda disappointed in you Dave.

  • @ahmetrefikeryilmaz4432 · 8 days ago +4

    I did everything, though I had to install Docker for Windows and use its WSL integration. I can display the web GUI, but there is no model available there, whilst it works in the CLI.

  • @douglascaskey7302 · 8 days ago +4

    Would have been funnier if the question was "What is the airspeed velocity of an unladen swallow?'"🤭

  • @SoloGuitar1000 · 8 days ago +6

    When I run the docker command, after downloading, it gives me the following error:
    docker: Error response from daemon: failed to create task for container: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running hook #0: error running hook: exit status 1, stdout: , stderr: Auto-detected mode as 'legacy' nvidia-container-cli: initialization error: load library failed: libnvidia-ml.so.1: cannot open shared object file: no such file or directory: unknown.

    • @ephemerallyfe · 8 days ago +1

      Same here. Would love to learn how to fix this.

    • @lyndenp · 7 days ago +2

      Same, frustrating. Any ideas anyone?

    • @jqoutlaw · 7 days ago +1

      Remove the --gpus=all parameter in the docker command. I have this running on a VM in proxmox using just the CPU and it fixed my issue.

    • @SoloGuitar1000 · 7 days ago

      @@jqoutlaw Thanks. That worked.
      In my casual sleuthing of the problem, it looked as if the NVIDIA gpu had something to do with it. I found that I have an AMD Radeon 780M gpu, so I looked to see how to run it with that, but none of the solutions I found worked.
      So I guess I'll just run it using the cpu instead.

    • @jsflood · 6 days ago +1

      My guess is that it's complaining about not finding the NVIDIA CUDA toolkit (which only helps if you have an Nvidia GPU, as @jqoutlaw mentions). Also do an update/upgrade: 'sudo apt update; sudo apt upgrade'

  • @DrCognitive · 2 days ago +2

    Wouldn't it be easier/better to take a machine and just run Linux as a default OS without using a virtual machine and sharing resources with your main OS?

    • @alok.01 · a day ago

      Yeah, it is. WSL 2 has many disadvantages if an app doesn't support it. And to my knowledge only VS Code is supported, other than the terminal of course.

  • @TKevinRussell · 5 days ago +1

    This is pretty cool. I installed it on a Lenovo laptop, Windows 11 Home, 13th Gen Intel i7-1355U,10 core, 16GB ram, SSD. Runs decent enough to experiment with. I am only running from a command prompt.

  • @TheTuubster · 8 days ago +6

    Already running InvokeAI (Stable Diffusion) and text-generation-webui (for Llama) for months locally. This is the first year I especially bought a GeForce gfx card (with 16GB RAM) not primarily for gaming but for Generative AI. The times they are a-changing. ;)

    • @FlintStone-c3s · 8 days ago

      Stable diffusion runs on Pi5 8GB, a bit slow, 3 minutes per image. Hoping the Hailo AI Hat can run it faster.

  • @TechDunk · a day ago +1

    I personally really like using LM Studio. It does everything, including downloading and loading models in a simple UI

  • @tonylewis4661 · 8 days ago +25

    How can we be sure that Dave made this video, and not his AI model?

    • @DavesGarage · 8 days ago +31

      That's what a bot would say!

    • @adjoho1 · 7 days ago +4

      ​@@DavesGarage sounds like something a synth would say.

  • @erickdanielsson6710 · 8 days ago +4

    Thanks Dave, I have a couple RHEL Linux systems at work I shall try this.

  • @JenniferBishop-ty6tt · 4 days ago +1

    For those with less hardware, the Llama 3.2 3B Instruct model is good for chat and requires a lot lower specs to run. I am able to run it on GPU using an Nvidia GTX1070Ti with only 8GiB of VRAM. So far it has been on par with Llama 3.0 & 3.1 for my use while being a lot faster. To get it, run the following: ollama pull llama3.2:3b-instruct-q4_K_M
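The comment above is a good illustration of model sizing: a quantized model's weights take roughly `parameters × bits / 8` bytes (so a 3B model at 4-bit is around 1.5 GiB), plus overhead for the KV cache and activations. A back-of-the-envelope helper can make that concrete; this is only a rule of thumb, and the 20% overhead factor is an assumption, not a measured constant:

```python
def approx_model_gib(params_billion: float, quant_bits: int = 4) -> float:
    """Rule of thumb: quantized weights alone take params * bits/8 bytes,
    i.e. 1B parameters at 8 bits is about 1 GiB. Real usage is higher
    (KV cache, activations), so treat this as a floor."""
    return params_billion * quant_bits / 8


def fits_in_vram(params_billion: float, vram_gib: float,
                 quant_bits: int = 4, overhead: float = 1.2) -> bool:
    """Guess whether a quantized model plausibly fits on a GPU.
    The 20% overhead factor is an assumed fudge factor."""
    return approx_model_gib(params_billion, quant_bits) * overhead <= vram_gib


# A 3B model at 4-bit easily fits an 8 GiB card; a 70B model does not.
print(fits_in_vram(3, 8))
print(fits_in_vram(70, 8))
```

This lines up with the experience reported above: a 4-bit 3B model is comfortable on an 8 GiB GTX 1070 Ti, while 70B-class models need multi-GPU rigs or CPU offload.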

  • @adamsnook9542 · 2 days ago +1

    Thank you very much for the video. FWIW, your final incantation to launch the web UI didn't work on my system - something to do with packages installed with snap not being able to see the graphics card drivers (?) Following the installation instructions on Open Web UI's page seemed to do the trick though.

  • @Billwzw · 7 days ago +2

    Thanks - now pretty please download the most ridiculous model available (maybe Llama3.1:405b ?) and show us what killer hardware can do !

  • @mookfaru835 · 4 days ago +2

    Did your wife divorce you for buying that machine? LoL

    • @DavesGarage · 4 days ago +1

      It's a loaner! I can't afford that kind of hardware!

  • @Machiavelli2pc · 5 days ago +1

    thank you. Decentralized, uncensored, 100% private, etc. AI/AGI really is important.
    ‘The path to hell is paved with good intentions’ is a quote that comes to mind when I hear governments, corporations, etc. trying to limit freedoms of individuals. AI is something that is too important to not have individuals be able to have absolute freedom over their own AI’s/AGI’s.

  • @RockBrentwood · 2 days ago +1

    Well, toss aside the speed/hardware/cost issue. That's something you can grow into or even Moore's-Law your way into (i.e. be patient).
    The *real* issues are: dialog & scripting with persistence that's even forkable, so that you can maintain contexts just like people maintain repositories and even do multi-party engagement with it. That's especially the case, if you're going to put up a public-side interface to this.
    That's a primitive form of knowledge-base integration, which gets to the next item: real-time updating and learning, not just training by some batch process at "update time". That's not trivial, and it's an issue that is *independent* of the scaling issue alluded to in the first sentence: how to integrate short-term memory into the long-term memory that is the model, itself. Trainability is also an important issue, yes, but I'd be more concerned with the ability to interface with components for a hybrid architecture that includes a logic and math engine and knowledge-base engine.
    The "advantages" that the greater resources put into the major AI-providers' models diminish exponentially because of the neural scaling law, so you can go a long way to getting into the same ballpark as them, without the blow-up up in resources that they have or used to get there. Hybridization could blow through that wall, slingshotting right past the big players, if it's done right - in a race to get there before they do. A model of your own is good, but you really need hooks into these other things to go with it, or you're just cosplaying OpenAI in the minor leagues. I want to move this to a more modular form, actually, as curriculum training; and also to mold a personality type. An already-provided pre-trained model is just a starting point to launch this off of, but only if the extra hooks are integrated into its design.

  • @johnmiller3665 · 8 days ago +4

    I'm sorry Dave, I'm afraid I can't do that!

  • @tomaselke3670 · 8 days ago +5

    Thanks!
    This is the tutorial I didn't know I needed, until now.

  • @angrd020 · 8 days ago +15

    Even though I've been running local inference and RAG for about a year now I still stopped everything to listen to Dave's explanation.... Because Dave..🕺🤖

    • @DWSP101 · 8 days ago

      What I want to know is if I can get an AI on my Steam Deck that I can use as a personal assistant for creating content, strictly based on all the data and information I create. I'd put all my information into a set of files - of course it's not going to be a single file, there are multiple layers of stuff. I just want to be able to give something like a custom GPT or Llama 3 all this information and have it act as a personal assistant for one topic alone, but an expert in that topic.

  • @PaulAppleyard · 5 days ago +1

    Docker UI Container command: You'll probably need to add 'sudo' at the start

  • @BrianBellia · 5 days ago +1

    It would be great if you could talk to it and activate it using your voice, just like Siri.
    That's what I'm waiting for. 🤞

  • @jamesscullin6388 · 8 days ago +1

    All the Linux nerds who see WSL and read "Windows as a second language", because Linux is their primary.

  • @88spaces · 4 days ago +1

    Wow, Dave! I had no idea. I don't have the beast machine you have but I'm going to put this to use. Thanks.

  • @jamndude · 2 days ago +1

    Hey Dave, as a former employee of Digital Equipment Corp for over ten years, I love the T-shirt.

  • @blakemcbride26 · 7 days ago +1

    Thanks, but how do you add your own information like information about your own large code base?

  • @no-bc4kc · 4 days ago +1

    Great vid... now all I need is to free up 5GB of space XD

  • @brennofrank · 5 days ago +1

    Hello Dave, can you share the link where you got the t-shirt ?

  • @vbisbest · 5 days ago +1

    Great tutorial. Would like see to one on how to create a custom model or add training to an existing model.

  • @harryheinisch3446 · 8 days ago +3

    Love the shirt, was there slightly after PDP days but got to see the old Alpha starting with EV54. You made this too easy to install on a laptop :)

  • @Bp1033 · 8 days ago +3

    I've been running LLMs locally for a while. The most useful thing I made one do was create weather reports from NWS weather data.
    I have 2 RTX 4060 Tis (16GB each) and my old RTX 2060 (6GB) in my server. It's really just a gaming desktop moonlighting as a server; it can run a 70B model decently though.

    • @organicdinosaur5259 · 4 days ago +2

      How do you like the complexity of the answers you get? Really annoyed that OpenAI is forcing people to use their hardware, especially since most people use ChatGPT for personal use. Is it worth it to set up something locally?

    • @Bp1033 · 3 days ago

      @@organicdinosaur5259 It's pretty good, honestly. I'm running a Q4 model (Llama-3.1-70B), but despite that it's rather accurate. For me personally, I'd say it's worth it, mostly because I can just download a random model and throw it on the server, so you're not stuck with ChatGPT. Qwen is a really good general model; it comes in a bunch of sizes.
      You can also run most 8B and smaller models on CPU at a pretty brisk speed, but it's better if you have a GPU with at least 8GB of VRAM. Ollama is super easy to set up on both Windows and Linux, so it's absolutely worth at least giving it a shot.

  • @TheInternetHelpdeskPlays · 8 days ago +2

    Amazingly, I have an Arc A770 and it comes with an AI playground so I could build my own graphic maker and chat system.

  • @Glademist · 3 days ago +1

    I run into this every time i try to install docker: docker: Error response from daemon: failed to create task for container: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running hook #0: error running hook: exit status 1, stdout: , stderr: Auto-detected mode as 'legacy'
    nvidia-container-cli: initialization error: load library failed: libnvidia-ml.so.1: cannot open shared object file: no such file or directory: unknown. :(

    • @oz457 · 10 hours ago

      same here

  • @Minglarr · 7 days ago +3

    You are absolutely the best at explaining everything so simply and well! Thanks for another great clip!

  • @skunked42 · 8 days ago +3

    Dave, getting close to 1mil subscribers!

  • @christiandior8726 · 8 days ago +3

    8:30 Dave flexing 388MB/s downloading

  • @Zyphera · 4 days ago +1

    Oh this is interesting! Please more of this, Dave!

  • @TheRex42 · 8 days ago +10

    I've been waiting for a straightforward tutorial like this to share after setting one up myself! Mistral has an amazing 12B small model that most GPUs can run.

  • @seanreynolds1266 · 8 days ago +25

    I'd like to add that ollama plays very nicely with the Continue VS Code extension, which means... a private, local GitHub Copilot too!

  • @TheMusicPoint · 8 days ago +3

    This is incredible, exactly what I have been looking for! ❤

  • @jamesross3939 · 6 days ago +1

    Next video: how to train your own model!

  • @jdouglas4564 · 2 days ago +1

    Didn’t understand a word but it was interesting

  • @Thag8Abrick · 4 days ago +1

    braincell explosion in my head

  • @VCRgameplay · 8 days ago +2

    Just a little note: wsl arguments like "install" require a double dash, as in "wsl --install"; in the video you show one dash - a typo, I guess.

  • @guiduz3469 · 8 days ago +2

    I bet you’re gonna hit 1M subs with this one! Deal for the next video: how to train your local AI? Also, how much power does that beast of a workstation pull? Can't wait to see how long it'll take to get a response on a human desktop...

  • @terpcj · 8 days ago +2

    I've been using Ollama on my PC for a while now (I opted for a Windows install with AnythingLLM as my front end...easy and no Linux needed). It's pretty good over all. Not bleeding edge. Maybe not even cutting edge. Regardless, it does fine if you pick the right model(s) for your needs. It definitely wants to stretch its legs, though. More disk space (for larger versions of models) and more CUDA cores (speed, baby) are definitely more better.

  • @stevenhawkins179 · 8 days ago +2

    Great informative video. I used the "cheap" and almost premade option of an Intel Arc A770 with their AI Playground program.

  • @BellaBardocz · 5 days ago +1

    Got inspired. Thanks.

  • @RussFryman · 8 days ago +2

    Thanks for this. I've been running LM Studio on my windows box and experimenting with a few different models. Was looking for inspiration to build a docker based AI server, and this hit the spot.

  • @Bilut · 8 days ago +1

    Or you can just install Msty :)

  • @excavate08 · 2 days ago +1

    ‘Do you want to play a game?’

  • @micahvanella2938 · 8 days ago +2

    Finally! I've been looking for a way to learn to make a GPT AI to consume rulebooks and modules for Old School Essentials so I can ask it questions and generate random encounters.

  • @AliSimplyLive · 3 days ago

    How about you make this all in your OS instead of explaining? Would have been better. Do this and that is not really beginner friendly.

  • @bigpickles · 8 days ago +6

    Open WebUI is the bee's knees. I've had every API and local model running through it for the past month and I just love it. I don't use that Docker rubbish though; it's a much easier install on Linux.

  • @joeysartain6056 · 8 days ago +3

    Love the "digital" t-shirt

  • @gregsb3454 · 6 days ago +1

    Another great episode

  • @MrKillerno1 · 6 days ago +4

    In the early '80s I had a program called 'Whatsit?' - maybe you know it. It was an early learning piece of software running on CP/M that did the same: it learned from the things you put into it. AI is based on these efforts. Later I tried to make a similar program in AmigaDOS with a friend, and it was fun. Just text-based.

    • @TheBodgybrothers · 4 days ago +1

      There is no way you made anything like LLMs on CP/M. Imagine thinking you invented something on a computer that barely had enough RAM for a primitive OS, for tech that has only been in research for the last 10 years.

    • @stultuses · 3 days ago +1

      That program's ability to learn was really a simplistic classifier, in that it had limited scope and could not really learn.
      AI is based on a lot of things that have gone before us, Whatsit included, although it was more that Whatsit was based on other fundamentals of its time.

    • @stultuses · 3 days ago +1

      @@TheBodgybrothers
      LLMs have been around as a concept well beyond 10 years!
      There have been LLMs that ran in batch mode that were large, but because of the limits of systems and RAM they were so slow as to be unusable - but they existed.
      In regards to MrKillerno1: of course he didn't run an LLM on a CP/M machine; I don't think that was his point. He was referring to a system that engages in a conversation and tracks that conversation across multiple interactions. In this regard he is correct: role-playing games and learning systems have been doing this for decades now.
      In terms of RAM, there are older OSes that can easily run programs beyond the constraints of their system memory; OpenVMS, for example, has both swapping and paging mechanisms. OSes now focus on speed, so they demand more memory rather than using concepts like swapping to run extremely large programs, but systems of old used to run large applications in very limited memory. I worked on one that had 16K of memory and ran the whole accounting ledger for a large municipality of over 1 million people.

    • @MrKillerno1
      @MrKillerno1 3 วันที่ผ่านมา

      @@stultuses And still, I had a lot of fun with it. These days when I talk to Alexa, Siri, or Google, they sometimes just do what they want. I appreciate being able to communicate with them by voice; that in itself, I think, is a masterpiece of programming.

    • @MrKillerno1
      @MrKillerno1 3 วันที่ผ่านมา

      @@TheBodgybrothers At the time I was working for a company that made medical database software to store data about medicines and patients. It had to run on big machines, which were very pricey then and had large storage devices (hard drives). It was also the time when 16-bit computers were arriving and Microsoft took hold of many industries. Luckily that software evolved into the database system it is today. It all started with one man and his machine, selling his product and hardware to numerous institutions and health practitioners (doctors' offices). As long as you had the thousands of dollars, you could buy it.

  • @QuantumKurator
    @QuantumKurator 7 วันที่ผ่านมา +1

    My wife would kill me.

  • @rushank2112
    @rushank2112 6 วันที่ผ่านมา +1

    I stay away from snap

  • @henson2k
    @henson2k 8 วันที่ผ่านมา +1

    Ollama has a Windows client

  • @VCRgameplay
    @VCRgameplay 8 วันที่ผ่านมา +2

    I would hug you if only for the amazing person you are

  • @hstrinzel
    @hstrinzel 8 วันที่ผ่านมา +11

    Wow, THANK YOU! Great video again! Would be interesting how much faster YOUR SETUP is compared to my 10 core Laptop, 32GB, and a 4GB 3050 . Right now it kinda crawls on most questions, but one can always come back 10 minutes later. The amazing thing is IT DOES give answers standalone.

    • @Planetdune
      @Planetdune 8 วันที่ผ่านมา

      Still pointless, then... I'll keep using Copilot for now.

    • @benjaminlynch9958
      @benjaminlynch9958 7 วันที่ผ่านมา +3

      I bet your issue is the 4GB of VRAM on the GPU. I ran it on just my CPU (5800X, 8-core Zen 3), and responses took less than a minute with no GPU acceleration. You might get better performance by cutting out the GPU entirely and letting it run just on the CPU, so the model doesn't have to load into VRAM piecemeal on every query.

  • @diycarcorner
    @diycarcorner 3 วันที่ผ่านมา

    Nice video, but a few parts I missed:
    Installing NVIDIA driver support in WSL Ubuntu for Docker. Without that it will not use GPU acceleration!
    You also missed that to run Docker as a non-superuser you need to configure Docker's permissions (add the user to the docker group).
    An alternative manual installation of Docker in WSL Ubuntu would also be nice; snap did not work for me.
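
    For anyone hitting those two snags, here's a quick sanity-check sketch (plain POSIX shell; the suggested fix-up commands in the messages are the usual Ubuntu/WSL ones, so verify them against current docs):

    ```shell
    # Check the two gotchas mentioned above: docker group membership
    # (needed to run docker without sudo) and the NVIDIA container toolkit
    # (needed for --gpus=all inside WSL Ubuntu).
    if id -nG | grep -qw docker; then
      echo "docker group: ok"
    else
      echo "docker group: missing -> sudo usermod -aG docker \$USER, then log out/in"
    fi
    if command -v nvidia-ctk >/dev/null 2>&1; then
      echo "nvidia container toolkit: present"
    else
      echo "nvidia container toolkit: missing -> sudo apt-get install nvidia-container-toolkit"
    fi
    ```

    Both checks are read-only, so it's safe to run before touching anything.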

  • @dziban303
    @dziban303 7 วันที่ผ่านมา +1

    love the shirt

  • @shedwork
    @shedwork 7 วันที่ผ่านมา +1

    I want your Digital tee Dave. "Best job I ever had..."

  • @moondoggie1968
    @moondoggie1968 8 วันที่ผ่านมา +2

    Fast Hardware Working Hard. Me too!

  • @samshort365
    @samshort365 7 วันที่ผ่านมา +1

    Future me installing this on my $100 quantum computer set-top box and thinking how quaint Dave looked installing this on a $50k server. Now, if only I had a time machine. Thanks Dave, really cool!

  • @FlintStone-c3s
    @FlintStone-c3s 8 วันที่ผ่านมา +1

    Don't have Dave's budget? I run LLMs on a Raspberry Pi 5 8GB. Some tiny LLMs will run on the Pi 4 4GB. About to order the Hailo AI HAT as they are now available Downunder.

  • @AlloffroadAu
    @AlloffroadAu วันที่ผ่านมา

    Can you make the same for the Mac? And what models are best to use for which purpose?

  • @markridlen4380
    @markridlen4380 7 วันที่ผ่านมา +1

    And if you run Linux, you can skip the whole WSL install part :) This is great... hopefully it runs on my laptop.

  • @MikeMacYT
    @MikeMacYT 3 วันที่ผ่านมา

    Sorry, Dave. Respect for the era you were involved with MS, but I bailed at your first mention of Windows. Nope. Never again. My only involvement with that mess is fixing the problems it constantly creates with my elderly parents' systems.

  • @qwerty3663
    @qwerty3663 7 วันที่ผ่านมา +1

    Are you going to do a subsequent video that compares the performance of this high-end loaner with your personal computers and something we might actually have (say

  • @supremepartydude
    @supremepartydude 8 วันที่ผ่านมา +2

    Great stuff Dave. Love your channel

  • @lapis.lazuli.
    @lapis.lazuli. 8 วันที่ผ่านมา +1

    Dave's gone about running Ollama in a really roundabout way, without much explanation. For 99.99% of you, just install Ollama directly on Windows without going through WSL. WSL is good for some things and not others. Why spend 30 minutes setting up WSL when Ollama can be installed in 30 seconds?
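
    For the curious, the direct route looks roughly like this (the winget package id and install-script URL below are from memory, so verify them before running):

    ```shell
    # Direct install, no WSL:
    #   Windows: winget install Ollama.Ollama   (or grab the installer from ollama.com)
    #   Linux:   curl -fsSL https://ollama.com/install.sh | sh
    # Once installed, confirm the CLI is on PATH:
    if command -v ollama >/dev/null 2>&1; then
      ollama --version
    else
      echo "ollama not on PATH yet"
    fi
    # From there a model is one command away, e.g.: ollama run llama3.2
    ```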

  • @Donald.Archer
    @Donald.Archer 5 วันที่ผ่านมา +1

    Does this only work with NVIDIA cards?

    • @DavesGarage
      @DavesGarage  5 วันที่ผ่านมา +1

      No. It uses the GPU on my Mac quite nicely, even!

    • @robertscheer3002
      @robertscheer3002 วันที่ผ่านมา

      On Debian it configures itself automatically and works with AMD ROCm as well, at least on my W7900 Pro card with 48GB of VRAM.

  • @franciscovarela7127
    @franciscovarela7127 8 วันที่ผ่านมา +2

    Lost me at the 50K machine.