Matthew, it would be great if there were a reference listing of all your videos so we could just pick a topic, find the videos that apply, and watch them again. I think you did a video with agents before that did something similar; I just wish they were easier to find.
Any idea how I could share Ollama's models with this so I don't have duplicated models? Also, for some reason, when you ask for a joke, it usually tells you the same joke. Weird. Any idea why?
LMStudio has an option in its settings to specify a folder to download the models to. I haven't tried it myself, but I would assume simply pointing LMStudio to the same folder you've been downloading to for ollama would be all you need to do as the models are a standard format from Huggingface.
How come the Nolan Arbaugh feed into GROK is not accessible? If my local setup can train off human BCI with how GROK trains then my local AI will be smarter-er.
6:58 Does anyone know why LM Studio uses VRAM instead of RAM in my case? LM Studio says that it only supports full GPU offload, but Matthew is using RAM.
Really excellent videos on LMStudio. Does it have the capability to access local files to update chats with data newer than the cut-off date of the LLM? I'd like to be able to input locally stored ebooks and generate summaries along the lines of "What is 'This Book' about?"
Yes please! Can you build a content series based on historical fun facts/events, using any vision, image, or video generation model, and Whisper or any free open-source text-to-audio generation? And if possible, train AI agents to run a shorts video series, with content-creator agents and a video-editor agent, to create at least 5 videos about historical fun facts. #AitellsFunnyHistory
Excellent video. Tqvm! Is there any video considering various option of hardwares for running local? Say various Nvidia GPUS, or AMD Rocm or even apple metal ?
Would be great if LM Studio supported Faraday characters. I find Faraday nicer for casual use, as LM Studio is more heavy-duty. Maybe LM Studio could have a "character mode" or something that was more like Faraday.
How can I fix this error? "A JavaScript error occurred in the main process. Uncaught Exception: Error: The specified module could not be found: \\?\C:\Users\Administrator\AppData\Local\LM-Studio\...\liblmstudio_bindings_clblast.node" (followed by an Electron module-loader stack trace through Module._load and require, with the line numbers garbled in the paste).
There are soo many of these now though... This one. Jan. GPT Everywhere. (I can't recall from memory any others but I have tested at least 2 more in the last month and they ALL run around the same.)
I'm especially interested in the ability to generate multiple responses from the same model then selecting the best one. Can LM Studio do that at this time?
Can you make them discuss something together? Model 1 prompt: Discuss ethics with model 2 and agree with everything model 2 says. Model 2 prompt: Discuss ethics with model 1 and disagree with everything model 1 says.
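That agree/disagree setup is easy to wire up in code. A minimal sketch, where `complete` is a stand-in for any chat-completion call (e.g. to two locally loaded models); the function names and personas are illustrative only:

```python
def debate(complete, topic: str, rounds: int = 3) -> list[tuple[str, str]]:
    """Alternate turns between two personas with opposite instructions.

    `complete(system_prompt, transcript)` is any chat-completion callable,
    e.g. a wrapper around a local model served by LM Studio.
    """
    personas = {
        "model_1": f"Discuss {topic} with model_2 and agree with everything model_2 says.",
        "model_2": f"Discuss {topic} with model_1 and disagree with everything model_1 says.",
    }
    transcript: list[tuple[str, str]] = []
    speakers = ["model_1", "model_2"]
    for turn in range(rounds * 2):
        speaker = speakers[turn % 2]           # strict alternation
        reply = complete(personas[speaker], transcript)
        transcript.append((speaker, reply))
    return transcript

# Dry run with a canned responder instead of a real model:
fake = lambda system, transcript: f"turn {len(transcript)}"
log = debate(fake, "ethics", rounds=1)
print(log)  # → [('model_1', 'turn 0'), ('model_2', 'turn 1')]
```

Swapping `fake` for two real model calls gives each persona its own system prompt while sharing one transcript.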
This is awesome! Thanks for sharing again!! I have a quick question... I want to run this on a server like you showed, create a no-code app using the APIs, and have users access that application, basically for a local group of users. How do you think this will work in terms of machine requirements? Please let me know if it's a good approach! 🙏🏻
I am looking for a way to install LM Studio on an external hard drive and keep all the models there. I am trying to find a similar solution for Automatic1111 and its Stable Diffusion models. It would be great to have it all portable, to keep main hard drive space free and to keep the models around. If possible, a video about it would be GREAT!
Project idea: create a bunch of agents that are experts in specific areas, like coding, Wikipedia, reasoning, law, etc., and then an orchestrating agent. The orchestrator would be the only one the user interacts with. The orchestrator then figures out how to respond to user queries by surveying the available agents and selecting one or more of them to produce the best answer possible.
Either the agents have descriptions of what they're good at, or, even better, the orchestrator can see each agent's metadata and recognize what they're good at just from how they're set up.
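The description-based routing above can be sketched in a few lines. This is a toy version: the agent names, the word-overlap scoring, and the lambda "models" are all placeholders (a real orchestrator would likely ask an LLM to choose among the descriptions):

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Agent:
    name: str
    description: str                 # what the agent claims to be good at
    answer: Callable[[str], str]     # stand-in for a call to a specialized model

def route(query: str, agents: list[Agent], top_k: int = 1) -> list[Agent]:
    """Pick the agent(s) whose description shares the most words with the query."""
    words = set(query.lower().split())
    scored = sorted(
        agents,
        key=lambda a: len(words & set(a.description.lower().split())),
        reverse=True,
    )
    return scored[:top_k]

agents = [
    Agent("coder", "writes and debugs python code", lambda q: "code answer"),
    Agent("lawyer", "answers questions about law and contracts", lambda q: "legal answer"),
]
best = route("please debug this python code", agents)[0]
print(best.name)  # → coder
```

With `top_k > 1`, the orchestrator could fan the query out to several specialists and merge their answers, which is the "one or multiple" behavior described above.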
There are coding copilots/agents out there already, like Pythagora GPT Pilot, Devin, Devika, Auto-GPT, GitHub Copilot, Warp, etc.
Personally, I rely heavily on the current Claude 3 Opus; OpenAI's ChatGPT-4 looks like a joke next to it! 😅
Yes, a specialized and optimized LLM will outperform a general model. If one could successfully train custom specializations, having a constellation of models for specialized tasks could result in very high capacity (like one 7B model for cyberpunk story structure, one 7B for dialogue in a cyberpunk setting, one 7B for pacing in adventure stories, etc.). You could have a setup with something like 28B of capacity devoted solely to making cyberpunk stories, while only running a single model at a time.
I have to say I use Claude 3 Opus as a first choice for AI.
@@userrjlyj5760g I have both, and I probably won't renew Opus, as it's not giving me anything GPT doesn't, and GPT can do more.
great idea! you could call it "mixture of experts"
I've been using LM Studio since your last tutorial on it, and I can attest that it's FANTASTIC to use and takes all of the headache out of setting up local AIs. TIP: In LM Studio's settings you can specify the folder to download AI models to. It's worth getting a small dedicated flash drive to store them on. That way you can play about with them without having to worry about hard drive space, as the smallest models are about 5GB and the largest can get into triple digits. Yes, loading them up will take slightly longer, but inference won't be affected, as that's done entirely from RAM (and if a model is too big for RAM, it will spill over to your main hard drive as virtual memory just as it normally would, so having it on an external USB flash drive doesn't affect it).
Hey, should I consider a Perplexity AI Pro subscription to analyze my previous years' exam questions, or will the free LM Studio also be good? Please reply.
Does LM Studio allow chatting with documents?
I'm trying to set this up properly, but even Private GPT doesn't work as it used to
@@serikazero128 AnythingLLM is good for interacting with docs
@@myandrobox3427 thanks, I'll look into it
@@jayr7741 Use Perplexity for now, unless you have a really high-end machine that can run big models (2x 3090 at 24GB each).
We need more use cases and practical guides for LM Studio. Love your videos. ❤
Thank you for the video, and thank you for disclosing that you are an investor in both LMStudio and CrewAI. I wish you could mention it in the video for better transparency.
Agreed. Would be highly regarded.
I considered this, maybe I should have. I didn’t want to be…show-off-y.
Are either of these public? Or are both private companies ?
I love how you move through topics and keep a concise summary of what is happening without going down rabbit holes. I learn a lot very quickly.
Thank you!
While you can run open source LLMs with LMStudio, LMStudio itself is *NOT* open source. I think you should probably clarify that in the video title, since it's a bit misleading.
I've re-read the title and caption multiple times since reading this comment and I'm confused as to what you find misleading?
*Btw this is not an attack.*
Unless he didn't mention it at all in the video, I think this is being ridiculously pedantic.
I switched to Ollama for that reason. It's open source and works like a charm.
@@Cross-CutFilms No problem, I didn't feel attacked at all ^^
What I mean is that it can confuse people who read the title too fast and fail to understand that the open-source part refers to the models and not to LM Studio (like myself XD).
When I read the title I thought: "Wait, so they open-sourced LMStudio? That's great! :D"... And then I visited their page looking for a repo, came back here and I realized I didn't read it right ^_^U
I think it's very easy to understand the title wrongly as I did (even more for non native English speakers), hence my comment.
@@urphakeandgey6308 If you meant my comment, sorry, I didn't want to be pedantic. I just wanted to clarify what the title meant by "open source", since I got it wrong and there could be more people who also misunderstood it.
Such awesome software! Can't wait to see local open-source software delivering agents and LLMs in 5 years; it will be such a ride!
This is getting cooler by the day, what a gift !!! Thanks for your professionalism and dedication.
The system requirements are very helpful. Thanks
Love this type of video, thanks so much for really going into LM Studio. I've had the program for a few months now but never really played with it.
If you're using it for role-play, also look into Faraday, which is more set up for role-play, and I find it runs the same models faster.
@@bigglyguy8429 thank you!!
I'm hoping in the future, there will be a way to train an LLM or specialised model more easily for a beginner. Almost iPhone-friendly.
No hope bro
Oobabooga already has it in the UI. You just need hardware that can cope with it.
YES to the agents locally.
Any interest in checking out Devika? Claim to be open source Devin.
I spent a good bit of time trying to get it to work but couldn’t. I’ll certainly do a review when I get it working though. Based on its popularity, I suspect it’ll evolve quickly.
@@matthew_berman If you do make a video on devika please please do a section on using it with a local LLM
How can good content from a genuine, ethical and knowledgeable person only have 200k subs in this niche when DS has 143k with his empty parroting grift? You are headed to ,,, broski, hope you're ready. You deserve it.
Loved this video, please make a playlist out of it! It's well suited for people like me.
Hey Matthew, the video was really helpful and informative. I love how you recorded your screen and it zooms in and out. Is there a specific screen recorder you used for this, or was this done in post?
Yes, please. Also publish your endpoints for consumers 🎉❤
Endpoints?
Always love your reviews. Thanks!
Great video, Matt - you have no longer jumped the shark. :)
Thanks Matt! You're my favorite AI TH-camr. I'd love to see you build something cool, we all would I'm sure.
I would really love to see document Q&A using LM Studio, because I think a lot of companies are interested in this kind of AI use.
Matt you tricked us yesterday with that 01 thing and that voice distraction
I think that was actually real. The presenter just wants attention, they want to get noticed when they speak (a form of narcissism).
I found the product to be interesting and the presentation to be distracting (a negative because it draws away from what is being presented).
@@erikjohnson9112 actually I think you are the narcissist and are just projecting. Transgender people have existed in every region of the Earth since before civilization itself. They're a real naturally occurring demographic. Get over it
Such a good way to test drive some of the local models. Great job on this and all of your other tutorials! I've really learned a lot from you vids.
Thanks for sharing this update and demonstrating with the examples!
Thanks Matt! Great video as usual :) Yes please would be nice to see you build something with powering agents!
One thing which would be useful... how to sync models between LM studio and Ollama (to avoid duplication to save space).
LM Studio's default local-models folder on Windows is: C:\Users\User\.cache\lm-studio\models
When I point it to Ollama's in Windows 11 WSL2 Ubuntu 22.04, it doesn't work (\\wsl.localhost\Ubuntu-22.04\usr\share\ollama\.ollama\models)
Anyone know the answer? Is it even possible?
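One likely reason pointing LM Studio at that folder fails: Ollama stores weights as content-addressed blobs (sha256-named files under `models/blobs`) rather than named `.gguf` files, so a scan for publisher/repo GGUF folders finds nothing. A hedged workaround sketch; the paths, the "largest blob is the weights" heuristic, and the folder names are assumptions, and symlinks across the WSL boundary may not behave:

```python
import pathlib

def share_largest_blob(ollama_blobs: str, lms_models: str) -> pathlib.Path:
    """Symlink the largest Ollama blob (usually the GGUF weights) into an
    LM Studio-style publisher/repo folder so both apps share one copy."""
    blobs = sorted(
        (p for p in pathlib.Path(ollama_blobs).iterdir() if p.is_file()),
        key=lambda p: p.stat().st_size,
        reverse=True,
    )
    dest = pathlib.Path(lms_models) / "shared" / "from-ollama"
    dest.mkdir(parents=True, exist_ok=True)
    link = dest / "model.gguf"
    if link.is_symlink() or link.exists():
        link.unlink()                 # replace any stale link
    link.symlink_to(blobs[0])         # no second copy of the weights on disk
    return link
```

Going the other direction (making Ollama see LM Studio's GGUF files) would instead need an Ollama Modelfile pointing at the file, since Ollama keys everything off its manifests.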
Looking forward to seeing you build a project .
Wow, this is Gold! I would love a tutorial on how to integrate this into a website.
use autogen to create a series of agents with their agent builder to run a task
also you can use this with AnythingLLM for documents
DO NOT USE ANYTHINGLLM, it is no longer free! I'm starting to see a pattern with this video creator. As soon as he releases a video, these so-called free things all of a sudden become paid versions! I used AnythingLLM one night on this computer after watching this guy's video on AnythingLLM; the very next morning at work I went to put it on a computer there and it was no longer free but subscription based! I AM DONE WITH THE MONEY GAMES, ON PRINCIPLE! THEY ARE ALL TRYING TO CASH IN ON THIS CRAP THAT AMOUNTS TO NOTHING MORE THAN A GIMMICK, AN AI TREND IF YOU WILL!!! BANKRUPT THESE PEOPLE!
Thanks for this Matthew! Super helpful overview.
Wow this is amazing, thank you.
Thank you for the helpful tutorial.
branching chat would be a dope chrome plugin
best video so far, thank you for this
Fantastic, I was just playing with this today. How would you integrate AnythingLLM with these different models running in parallel and interacting through crewAI?
Man, i like your videos. Very informative.
Considering what I was going through to install models previously, this is dumbfoundingly simple.
LM Studio really useful
Agents working locally!!❤
Throttle: genius! LM-Studio got definitely improved. Gorgeous video!
I can’t believe how fast it’s running, fully offloaded, on my GPU (OC 4090). I’m using Mistral Claud merged 7B Q8.
Your new test for LLMs should be to ask them if they can write a piece of Python that uses the OpenAI API correctly. GPT-4 fails even with the correct docs uploaded. Claude fails even when updated with the current API docs. Crazy.
They should get a visual node setup like ComfyUI, but for creating purely LLM-based apps.
Some generate button or whatever that spits out the Python code (or just runs it).
Not that it's hard to write, but some might prefer the node setup.
fantastic, really great video
If LM Studio had built-in RAG, it would be perfection.
Nice video as always 👍
So the big question is: when are they going to start charging money to use LM Studio? You've got the best content, as usual!
LM Studio and its model server are soooo easy.
I'm hoping they add combining and fine-tuning functions, or at least image/other generation models, in the future.
Yes pleeeease do a video using AutoAgent tutorial. Thanks.
I'd love to see videos on running specific models.
I've been able to run almost every model I've downloaded, but I can't seem to get StarCoder or StarCoder2 working regardless of the preset I use, and I'd love to get it running without hallucinations or looping the same sentence.
Another thing I'd love to know is what happened to TheBloke!! I heard he stopped converting models, and I'm sure the reason behind it has to be epic.
Great video thanks! Any known good models I can use for infosec & or application security (pen testing)?
Hey Matt, been following you for a long time; awesome work. Does LM Studio allow different models to exist on separate GPUs?
Like if you have multiple GPUs? I don’t think so
A model for software development would be interesting.
I'm dropping this comment in case someone is using this on Linux and has an old machine with a built-in GPU. I was facing an error that says: "unable to load any model". A quick solution is unchecking GPU acceleration. Hope this helps!
Wow, finally a UI with documentation? I always hated that they throw all that GGUF, GPTQ, Q4/Q5, _K_M/_K_S stuff at you without ever telling you what it means and what it needs to run.
I don't think that's how the compatibility filter works. I checked a couple of models that live in repos with tensor files; they did not show up while that filter was active. I guess they count "does not work in LM Studio" as "not compatible with your system", because it isn't compatible with their tool? At first I thought it wasn't true that every model on HF could be found there, but the missing ones showed up once I turned the filter off. Of course, this is a sample size of trying 3 models or so; they may be outliers. But do any models that are not GGUF work in LM Studio?
Hey, can you share some advice for an undergrad to get into generative models? Where should I begin, or rather, what should I learn to understand the workings of LLMs and play around with them?
I've used LM Studio (and also Jan) with some fairly large models, but I haven't seen an explanation of how to set up multiple languages (and alphabets if necessary). That should be easy; geeks, please let me know. The second question: where can I find an app/code/whatever so the interface is not text-based but includes speech, both listening and speaking (again, in a few languages)? Or, what I really want: a streaming avatar (like at HeyGen or D-ID or others) which will listen and then more or less instantly move its lips and face, syncing with a spoken answer in the same language as the question. I ask because I want to do this all locally; it's for education and we want to keep it cheap and offline, so middle school students can't go "elsewhere". Suggestions?
I have to add that I've created "Assistants" at Hugging Face that can do multiple languages, but they are still text-based. And online.
LM studio is a nice project. The only complaint I have is that it is not open source.
People can use LM Studio along with lollms, as it can be run as a server, and they seem to get very good output. So yeah, this is a very cool and useful tool.
2 questions: (1) do these models run offline (good for private data?); and (2) is LMStudio better than ollama?
Matthew, you should conduct Mac vs PC local LLM comparisons: M2, M3, RTX 4090; MBP vs Studio.
Is there any way to use all of the gaming PCs in my house to distribute the GPU load? Or, if I would like to use a cloud service for overhead, is there a way to run all this locally, but use cloud GPU when needed?
Can you install Grok from the download provided by Elon Musk? Or do you always have to use their download link?
You want to double down on this theme and ride it for the next few years or five: local, uncensored, real-time, able to scan a wide variety of file formats, P2P sharing of models and training data, plus no token limits or API calls. The way to go. Maybe you can even delve into some open-source LLM mesh networking or something. I think that once the HW/SW matures and declines in price a bunch, and things like CXL become common, it is really going to light a fire under the SMB sector: lots of innovation, increased productivity, and discovery.
I run anythingllm on top of this. It has a nice rag setup.
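For readers curious what that RAG layer does under the hood, here is a toy sketch of the retrieve-then-generate pattern. Word-overlap scoring stands in for real embeddings, and `generate` stands in for the local model call; none of this is AnythingLLM's actual implementation:

```python
def retrieve(query: str, docs: list[str], k: int = 1) -> list[str]:
    """Rank documents by naive word overlap with the query (embedding stand-in)."""
    q = set(query.lower().split())
    return sorted(
        docs,
        key=lambda d: len(q & set(d.lower().split())),
        reverse=True,
    )[:k]

def rag_answer(query: str, docs: list[str], generate) -> str:
    """Retrieve supporting text, then let the model answer with it as context."""
    context = "\n".join(retrieve(query, docs))
    return generate(f"Context:\n{context}\n\nQuestion: {query}")

docs = [
    "LM Studio runs GGUF models locally.",
    "Bananas are rich in potassium.",
]
top = retrieve("which models does LM Studio run locally", docs)[0]
print(top)  # → LM Studio runs GGUF models locally.
```

A real setup swaps the overlap score for vector similarity over chunked documents, but the retrieve-then-prompt shape is the same.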
Great video, Matt.
🎯 Key Takeaways for quick navigation:
00:00 *🌐 LM Studio simplifies running open-source large language models locally without coding knowledge, featuring a user-friendly interface and enhanced functionality.*
00:27 *🔄 The new update allows running multiple models simultaneously, making the process straightforward.*
00:54 *💻 LM Studio supports Mac, Windows, and Linux platforms, facilitating easy download and installation.*
01:08 *🔍 The software integrates with Hugging Face to search and download various large language models.*
01:35 *🤖 Features tools to predict if a model will fit on your computer, enhancing usability for newcomers.*
02:13 *📚 LM Studio includes built-in detailed documentation to assist users in understanding and utilizing different model quantizations.*
03:09 *📞 Introduces an AI chat tab, similar to ChatGPT, that uses local models, offering an engaging user interface.*
04:20 *🌱 Includes new features such as Branch conversation for testing and quality assurance.*
05:15 *⚡ Demonstrates significant speed improvements using built-in hardware acceleration like Apple's Metal.*
05:41 *🎮 The "Playground" section allows loading and managing multiple models simultaneously to enhance outputs.*
07:45 *💾 Introduces JSON mode to better integrate model outputs into applications, promoting ease of API utilization.*
10:17 *🚀 Shows how to run models simultaneously without sequential limitations, leveraging powerful computer capabilities.*
11:55 *🌐 Offers API capabilities to manage and execute model completions, extending usability for developers.*
13:14 *🤝 Highlights collaboration with other platforms like Autogen and potentially Crew AI, indicating LM Studio's flexibility in multi-agent environments.*
Made with HARPA AI
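For anyone curious about the API mentioned in the takeaways: LM Studio's local server speaks an OpenAI-compatible protocol (by default at http://localhost:1234/v1), so you can hit it with plain HTTP. A minimal sketch, assuming the default port and with the model name as a placeholder for whatever you have loaded; the `response_format` field mirrors OpenAI's JSON mode, and exact support may vary by LM Studio version:

```python
import json
import urllib.request

def build_chat_request(model, user_message, json_mode=False):
    """Assemble an OpenAI-style chat completion payload for a local server."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
        "temperature": 0.7,
    }
    if json_mode:
        # JSON mode constrains the reply to be valid JSON (OpenAI-style field)
        payload["response_format"] = {"type": "json_object"}
    return payload

def chat(payload, base_url="http://localhost:1234/v1"):
    """POST the payload to LM Studio's local server, return the reply text."""
    req = urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]

# Usage (with the server running in LM Studio's "Local Server" tab):
#   print(chat(build_chat_request("loaded-model-name", "Tell me a joke.")))
```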
LMStudio is cool.....Please, make more videos on this tool....
Ironically, LMStudio itself is not open source. The AppImage of LMStudio works perfectly on Debian.
Amazing to me how little RAM the models use.
Time for me to get a maxed M3 Max MBP I guess, although gonna wait till after the May 7 event JIC. (I know very unlikely to have any MBP hardware impact, but I’m cautious 🙂)
Doesn't work on my Pi5, but Ollama does. They only seem to use memory when answering a prompt, so I can have multiple instances of Ollama running with different models, as long as I only prompt one at a time. AMD Ryzen AI and Intel Core Ultra have NPUs onboard now, so no need for a big GPU card.
I want to build an agent team. First, I transcribe an audio file with Whisper large-v3, then the next agent can make a PowerPoint presentation. The next one can write a macro for PowerPoint, and the next one can implement it and generate a full PowerPoint presentation with a template I have selected! What do you think?
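The transcribe-then-present pipeline above can be sketched with real libraries: `openai-whisper` for transcription and `python-pptx` for deck generation. The sentence-chunking heuristic here is just a placeholder for what an LLM agent would actually do, and the file names are made up:

```python
# Sketch of the audio -> slides pipeline (pip install openai-whisper python-pptx).
import re

def transcript_to_slides(transcript, bullets_per_slide=4):
    """Naively split a transcript into slide-sized groups of sentences."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", transcript) if s.strip()]
    return [sentences[i:i + bullets_per_slide]
            for i in range(0, len(sentences), bullets_per_slide)]

def build_deck(slides, path="deck.pptx"):
    """Write one bulleted slide per group using python-pptx."""
    from pptx import Presentation
    prs = Presentation()
    layout = prs.slide_layouts[1]  # built-in "title and content" layout
    for n, bullets in enumerate(slides, 1):
        slide = prs.slides.add_slide(layout)
        slide.shapes.title.text = f"Slide {n}"
        body = slide.placeholders[1].text_frame
        for b in bullets:
            body.add_paragraph().text = b
    prs.save(path)

# Transcription step (requires openai-whisper and ffmpeg installed):
#   import whisper
#   text = whisper.load_model("large-v3").transcribe("talk.mp3")["text"]
#   build_deck(transcript_to_slides(text))
```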
Matthew, it would be great if there was a reference listing of all your videos, so we could just pick a topic, find the videos that apply, and watch them again... I think you did a video with agents before that did something similar; I just wish they were easier to find.
Does it support exl2 quants? If not, that would be the only thing missing to make the switch from textgen web UI.
ty. great video
Thanks for this video. I love this development setup. How do you properly serve it all from a cloud?
Great video
any idea of how I could share the models of ollama with this so I don't have duplicated models?
Also, for some reason, when you ask for a joke, it usually tells you the same joke... weird. Any idea why?
LMStudio has an option in its settings to specify a folder to download the models to. I haven't tried it myself, but I would assume simply pointing LMStudio to the same folder you've been downloading to for ollama would be all you need to do as the models are a standard format from Huggingface.
I wish LM Studio would recognize symbolic links or junctions; I have a lot of models and they don't fit on one drive anymore.
How come the Nolan Arbaugh feed into GROK is not accessible? If my local setup can train off human BCI with how GROK trains then my local AI will be smarter-er.
👌"I have learned a lot."
6:58 does anyone know why LM studio uses VRAM instead of RAM in my case? LM studio says that it only supports full GPU offload, but Matthew is using RAM.
Really excellent videos on LMStudio. Does it have the capability to access local files to update chats with data newer than the cut-off date of the LLM? I'd like to be able to input locally stored ebooks and generate summaries along the lines of "What is 'This Book' about?"
Yes please, can you build a content series based on historical fun facts/events, using any vision, image, or video generation model, plus Whisper or any free open-source text-to-audio generation? And if possible, train AI agents to run a shorts video series, with content-creator agents and a video-editor agent, to create at least 5 videos for a historical fun facts series. #AitellsFunnyHistory
Doctor doom AI in the thumbnail
Excellent video. Tqvm! Is there any video covering the various hardware options for running locally? Say, various Nvidia GPUs, AMD ROCm, or even Apple Metal?
It also integrates with langchain and llamaindex
Would be great if LMStudio supported Faraday characters. Faraday I find nicer for casual use as LM Studio is more heavy duty. Maybe LM Studio could have a "character mode" or something that was more like Faraday.
How can I fix this? Error: A JavaScript error occurred in the main process
Uncaught Exception:
Error: The specified module could not be found: \\?\C:\Users\Administrator\AppData\Local\LM-Studio\...\liblmstudio_bindings_clblast.node
    at process.dlopen (node:electron/js2c/asar_bundle)
    at Module._extensions..node (node:internal/modules/cjs/loader)
    at Module.load (node:internal/modules/cjs/loader)
    at Module._load (node:internal/modules/cjs/loader)
    at Module.require (node:internal/modules/cjs/loader)
    at require (node:internal/modules/cjs/helpers)
    (remaining frames inside C:\Users\Administrator\AppData\Local\LM-Studio\...\resources\app\.webpack)
There are so many of these now though...
This one.
Jan.
GPT Everywhere.
(I can't recall from memory any others but I have tested at least 2 more in the last month and they ALL run around the same.)
I'm especially interested in the ability to generate multiple responses from the same model then selecting the best one. Can LM Studio do that at this time?
Hmm, so which is better for a newbie VS Code dev, LM Studio or Ollama?
Can you make them discuss something together?
Model 1 prompt: Discuss ethics with model 2 and agree with everything model 2 says.
Model 2 prompt: Discuss ethics with model 1 and disagree with everything model 1 says.
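The agree/disagree idea above can be sketched against LM Studio's multi-model server, assuming its default OpenAI-compatible endpoint (http://localhost:1234/v1); the model names are placeholders for whatever two models you have loaded, and each model only sees the other's latest message:

```python
# Sketch: two locally loaded models take turns debating each other.
import json
import urllib.request

BASE_URL = "http://localhost:1234/v1"

def pick_turn(turn, model_a, model_b):
    """Even turns go to model A (told to agree), odd turns to B (told to disagree)."""
    if turn % 2 == 0:
        return model_a, "Discuss ethics. Agree with everything your counterpart says."
    return model_b, "Discuss ethics. Disagree with everything your counterpart says."

def ask(model, system_prompt, last_message):
    """Request one reply from a loaded model via the local server."""
    payload = {
        "model": model,
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": last_message},
        ],
    }
    req = urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]

def debate(model_a, model_b, opener="Is lying ever ethical?", turns=4):
    """Alternate between the two models, feeding each the other's last reply."""
    transcript = [opener]
    for turn in range(turns):
        model, stance = pick_turn(turn, model_a, model_b)
        transcript.append(ask(model, stance, transcript[-1]))
    return transcript

# Usage (with both models loaded in the Playground and the server running):
#   for line in debate("model-one", "model-two"): print(line)
```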
You need to do a troubleshooting video: when the download folder gets moved, it ruins everything, and I can't seem to realign the path.
"Run ANY Open-Source LLM Locally": I knew this would be click-bait, but I had to check.
This is awesome! Thanks for sharing again!! I have a quick question... I want to run this on a server like you showed, create a no-code app using the APIs, and have users access that application; kinda creating it for a local group of users. How do you think this is going to work in terms of machine requirements? Please let me know if it's a good approach! 🙏🏻
Would that version be good for a professional translator from English to Spanish?
Can we get a summary of how this improves over Ollama? Like, I can import Ollama and integrate it into my code, but how about this?
I am looking for a way to install LM Studio on an external hard drive, to keep all the models on the external drive. I am trying to find a similar solution for Automatic1111 and Stable Diffusion models. It would be great to have it all portable, to keep main hard drive space free and to keep the models around. If possible, a video about it would be GREAT!
Matt, can you please tell the creators of LM Studio to include the ability to make the chat font larger.