🚀 Work 30% faster with Vectal: www.vectal.ai/
🤖 Wanna start a business with AI Agents? Go here: www.skool.com/new-society
Do yourself and your users a favor: Mistral NeMo. It's only 12B and I think it can run on a phone or tablet? Maybe I'm wrong. It's the best small model; I used it extensively before switching to Mistral Small for local use.
You forgot to mention that the model is 45 GB, which doesn't even fit on the most expensive consumer GPU, meaning it will offload to RAM and be incredibly slow. So you have to use a quantized version, which isn't better than GPT-4o, and you still need a very good computer for it.
Would you do a video on how to get an open source “advanced voice mode” working on our phones? 🙏
lol 😂
Thought you were gonna run the LLM on a phone, but it's just using an API 🗿
For now perhaps. How many phones are already close to having 64GB of RAM?
@pdhowler but it is not private if it's using an API...
I did that locally. Check it out.
@@timbarnette5545 Pretty sure you can run a private API locally and still connect to the private LLM via that smartphone if you're on the same network…
It's private since the API is running on his own computer.
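For anyone curious what that looks like in practice, here is a minimal sketch of calling a local Ollama server from another device on the same network. It assumes Ollama was started with OLLAMA_HOST=0.0.0.0 so it listens beyond localhost; the LAN IP and model tag below are placeholders, not anything from the video.

```python
# Minimal sketch: query a local Ollama server from another device on the
# same Wi-Fi. Assumes the default Ollama port (11434) and that the host
# machine's LAN IP is 192.168.1.50 (a made-up placeholder).
import requests

OLLAMA_URL = "http://192.168.1.50:11434/api/generate"

resp = requests.post(
    OLLAMA_URL,
    json={
        "model": "llama3.3",   # whatever model you grabbed with `ollama pull`
        "prompt": "Explain unified memory in one sentence.",
        "stream": False,       # one JSON response instead of a chunk stream
    },
    timeout=300,
)
resp.raise_for_status()
print(resp.json()["response"])
```

Nothing here leaves the local network, which is the whole point: the phone is just a thin client for a model running on your own hardware.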
What are the computer specs needed to run llama 3.3 locally?
Same question
If he's using Ollama's default llama3.3 command, then it's the quantized Llama 3.3 70B Q4_K_M model, which is around 43 GB in size, so his Mac definitely has to have more than 43 GB of unified memory (rough math in the sketch below).
@@naeemulhoque1777 Then he's possibly using a Mac Studio
@@naeemulhoque1777 Isn't the issue in question the RAM usage, not necessarily the memory? Think ChatGPT will know, lol, just a joke.
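The ~43 GB figure is easy to sanity-check with a back-of-envelope estimate. A Q4_K_M quant averages roughly 4.8 effective bits per weight (an approximation; different layers use different quant types), so:

```python
# Back-of-envelope GGUF size: parameters * bits_per_weight / 8 bits per byte.
# ~4.8 bits/weight is a rough average for Q4_K_M, not an exact figure.
params = 70.6e9          # Llama 3.3 70B, approximate parameter count
bits_per_weight = 4.8    # rough effective average for Q4_K_M

size_gb = params * bits_per_weight / 8 / 1e9
print(f"~{size_gb:.0f} GB")  # ~42 GB, in line with the ~43 GB download
```

And the weights aren't everything: the KV cache for your context window and the OS itself need memory too, so in practice you want comfortably more unified memory than the file size alone suggests.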
What are the specs of your MacBook?
Excellent commentary and content David. Thanks for sharing your knowledge!
Just waiting for the uncensored version.
Well done man. Very cool and useful.
Awesome tutorial. Great channel.
That app seems very helpful, especially considering that someday soon >10B parameter models may become much more capable than they are today.
What are the specs needed? It says "not enough memory" for me.
Hey Ondrej, iPhone Mirroring exists 😬 Is it possible to record the screen while using iPhone Mirroring? Maybe a better setup, no?
Great instructions. Please make the video on adding SSL to the API. Thank you.
Which version have you downloaded? macOS / Linux / Windows?
Regards
David, which MacBook are you using? I'm on the MacBook Pro M1 Max.
What do you think about the Microsoft CEO saying SaaS will not exist? We can build whatever we want with a few prompts, and as soon as the context is large enough, why would we need all these apps?
If you want to put your faith in that dude, go hard....😂
Yup, that's the future of software: a database and AI agents on top. 😊 Already building a PoC for real estate.
Enchanted is free on iOS
Is there a good vision language model that is also uncensored?
Llama 3.2 Vision is extremely censored.
Dolphin
love this
Requesting a tutorial on the latest AI image generation tech as well! :)
What are the options for Android?
I'm not an Apple guy.
I've been using an app called Amallo. It doesn't seem to be on the Play Store, but I found it on APKPure; haven't had any issues with it so far.
As an added bonus, I have Cloudflare tunneling enabled on the computer I run Ollama on, so it works outside of my home Wi-Fi as well (sketch after this thread).
@@Jordan-9595 What does that have to do with his question?
@@WesTheWizard There were 2 comments; I guess the first got deleted. I suggested using Amallo for Android, then added another comment on how I use it.
My best guess is it's because Amallo isn't a Play Store app.
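On the Cloudflare tunneling setup mentioned above: a quick tunnel started with `cloudflared tunnel --url http://localhost:11434` prints a temporary *.trycloudflare.com URL, and the same Ollama API call then works from anywhere by swapping in that URL. A minimal sketch, with a placeholder hostname:

```python
# Minimal sketch: the same Ollama API call as on the LAN, routed through a
# Cloudflare quick tunnel so it works outside home Wi-Fi. The hostname is a
# placeholder; cloudflared prints the real one when the tunnel starts.
import requests

BASE = "https://example-tunnel.trycloudflare.com"

resp = requests.post(
    f"{BASE}/api/generate",
    json={"model": "llama3.3", "prompt": "ping", "stream": False},
    timeout=300,
)
print(resp.json()["response"])
```

Keep in mind the tunnel makes the endpoint publicly reachable, so putting some form of authentication in front of it is a good idea.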
Someone has to say it - it's reasonably open source, not completely.
6:47 😅😅😅
It took 12 hours to download 😅
I think your accent is improving
Man, using AI locally would mean being able to chat with it offline, which in your case isn't happening, since you need to be connected to the network and use an API.
Same thing as if you used the GPT app, genius 🤣
Wait.... Don't share your IP... But paste it here... Into this app.😂😂😂
thanks buddy
Mahh
Interesting
This comes across as an unserious video, given that you say nothing about the minimum requirements, leading a lot of people to waste their time trying to run something that won't work on their computers.
Vectal AI promotion every time, lol; instead we can use the Claude web interface or build our own 😂