Great content Matt. Keep up the good work!
I have just an old Quadro M4000, but Ollama works fine. I'm so happy with it.
pure gold Matt!
Matt, thanks for the great content. I have a 2016 i7 laptop with 32GB RAM and a 6GB 1070 Ti, and I can run 13b and 27b models easily. It's a great platform! Please do a crash course on templates.
I was planning to do that. Thanks
Your videos are a gem, thank you. Just a suggestion for a topic: resource management. I don't understand how, in a multi-tier system with dedicated servers, there is such a difference in the memory allocated for Ollama when operating from curl, Open WebUI, or MemGPT/Letta. How can I tune what the client reserves on the Ollama server?
Hmmm. There isn't anything different that Ollama does. Maybe WebUI and that other thing do something.
What is MemGPT/Letta?
Any chance this difference can be accounted for by the inclusion (or lack thereof) of conversation context history?
Well, there is that, or the clients have adjusted the context size (without telling you) up to the maximum the model supports.
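To make that concrete: the context size is a per-request option, and a larger context means the server allocates more memory for the same model. A rough curl sketch (the model name and the 8192 value are just examples, adjust to yours):

# ask for a larger context window on a single request
curl http://localhost:11434/api/generate -d '{
  "model": "llama3.2",
  "prompt": "Why is the sky blue?",
  "options": { "num_ctx": 8192 },
  "stream": false
}'
# running `ollama ps` afterwards shows the loaded model and its memory footprint,
# which grows as num_ctx grows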
Thank you so much Matt! As always, everything is relevant, clear, and interesting! I have several questions for you:
1. How do I know what information the model was trained on? What skills does it contain? I have a weak computer, so I use small models. If I know what information was put into the model, I will understand whether I should use it for my purposes.
2. Is there any way to remove unnecessary information from a model, so that I can then train it on my own? I am grateful to you in advance for your professional answers. From the bottom of my heart I wish you the soonest 1,000,000 subscribers, success and prosperity. You are the best!!!
Usually the card describing the model says where the data it was trained on comes from.
Removing info from a model is very hard and computationally very expensive.
@@technovangelist Thanks!
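A side note that may help locally (it won't show the training data itself): `ollama show` prints the metadata Ollama keeps for a model, such as family, parameter count, context length, quantization, and license. For training-data details you still need the model card on ollama.com or Hugging Face.

# print local metadata for a model you have pulled (does not include training data)
ollama show llama3.2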
Hello Matt, how can I run llama3.2-vision using Ollama from Postman? I want to send an image as input.
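In case it helps: as far as I know, the chat endpoint takes images as base64 strings in an "images" array on the user message, so in Postman you would POST a raw JSON body like the one below to http://localhost:11434/api/chat. A curl sketch of the same request (model, file name, and prompt are just examples; base64 -w0 is the GNU/Linux flag, and a smallish image keeps the command line manageable):

# encode the image and send it along with the prompt
IMG_B64=$(base64 -w0 photo.jpg)
curl http://localhost:11434/api/chat -d '{
  "model": "llama3.2-vision",
  "messages": [
    { "role": "user", "content": "Describe this image", "images": ["'"$IMG_B64"'"] }
  ],
  "stream": false
}'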
Hi, thanks for all the great videos! I have an unusual issue and hope you can help. I use Ollama for both daily tasks and larger projects. For the bigger models, I’ve moved the files to an external drive due to their size and set the environment path (on Windows), which works well.
However, for my daily tasks, it’s inconvenient to always have the external drive connected, especially for using basic models like LLaMA 3.2. Is there a way to set up two model locations so it can read from both when available, or default to the laptop when the external drive isn’t connected?
Thanks in advance! 🥛
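As far as I know, Ollama only reads from a single models directory (set with OLLAMA_MODELS), so it can't merge two locations, but a small wrapper that picks the directory before starting the server gives you the fallback behaviour. A bash sketch of the idea (paths are made up; on Windows the same idea works in PowerShell by setting $env:OLLAMA_MODELS before starting Ollama):

# start Ollama with the external drive's models if it is mounted, else the internal default
if [ -d "/mnt/external/ollama-models" ]; then
  export OLLAMA_MODELS="/mnt/external/ollama-models"
else
  export OLLAMA_MODELS="$HOME/.ollama/models"
fi
ollama serve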
It is all interesting even though I struggle to understand some things. I hope your channel does well going forward.
😊 Hi Matt, hope you find time to do a chat stream at some point.
Thanks, Matt. This seems strangely related to the questions I was asking on Discord about what is called "a model" vs. what the GGUF file is, because it can be somewhat confusing to see the catalog of models on ollama.com and a separate catalog of models on Hugging Face. I'm trying to grasp the notion of a model, which makes it look like it's code even though it's not, and how it relates to the model template, which is not the same as the system prompt.
I understand that the model template is somehow used in the creation of a model, but is the template language itself a standard that tools besides Ollama can understand?
All models require a template to be usable. The general idea is the same across tools, but some tools use Jinja templates to express it, and Ollama uses Go templates.
@@technovangelist If a template includes a system prompt, does that mean the system prompt is injected behind the scenes into every prompt (since system prompts can be modified)?
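As I understand it, yes: the template is what wraps every request, and whatever system prompt is currently set fills a slot in it each time. You can see a model's real template with `ollama show <model> --template`. Below is a simplified sketch of what such a Go template can look like in a Modelfile; the tokens are made up and not from any particular model.

# view the actual template a model ships with
ollama show llama3.2 --template

# simplified sketch of a chat template; {{ .System }} is where the current
# system prompt gets spliced in on every request
TEMPLATE """{{ if .System }}<|system|>
{{ .System }}<|end|>
{{ end }}<|user|>
{{ .Prompt }}<|end|>
<|assistant|>
"""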
Thanks a lot for the course, Matt. I have a 2020 iMac with an AMD Radeon, which doesn't work with CUDA. In your experience, is there a way to use an external graphics card that works with Ollama?
An Intel Mac won't be able to access the GPU.
@@technovangelist OK, thanks a lot for taking the time, sir. Time to change my Mac. Would you share which specs are relevant for working with Ollama?
Thank you!
I think any Apple silicon Mac is amazing. Getting the most memory and disk you can afford is important. With 64GB of RAM I can do up to a 70b model, though I rarely do. Depending on your workload I would pick at least 1TB; I have 4TB and it's great, though I spend a lot of time offloading stuff. The new Macs should be out soon, but a used M1 or M2 is great too.
@@technovangelist Once again, Matt, thanks for taking the time. I appreciate it a lot, sir. You're the best.
@@technovangelist Do you have any opinion about Apple M2 Ultra? I'm considering getting an Apple Mac Studio with M2 Ultra, 64 GB of RAM, and 1TB of SSD.
I wish there were a model move command. The internal model folder can use up my SSD's free space, some of my models are huge, and I don't want to re-download them. It would be nice to have a model move command to offload models I'm not using for now onto an external SSD, and then conveniently copy them back to the internal SSD model folder when needed again. I made a Python script that just moves the "big" files, but it's not 100% reliable, and a few times I was forced to re-download the model to get it working again. An official Ollama model mover tool would be most useful: one that keeps all the dependency files organized and working for the target model when moving between SSD model folders.
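In case it helps anyone writing such a script: as far as I understand the storage layout, each model is a small JSON manifest plus a set of content-addressed blob files, and the manifest lists exactly which blobs that model needs, so the manifest and all of its blobs have to move (and come back) together. A rough sketch that lists the blob files for one model/tag (paths assume the default Linux/macOS layout; requires jq):

# list the blob files a given model/tag depends on
MODELS_DIR="$HOME/.ollama/models"
MANIFEST="$MODELS_DIR/manifests/registry.ollama.ai/library/llama3.2/latest"
# digests look like "sha256:abc..." while blob files are named "sha256-abc..."
jq -r '.config.digest, .layers[].digest' "$MANIFEST" | sed 's/:/-/' |
  while read -r blob; do echo "$MODELS_DIR/blobs/$blob"; done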
A better lesson is the swap file! The bigger your swap file, the bigger the models you can load. Even on a little RPi 5 I can run massive models. Yeah, it's slower, but it's free :D
This is for Linux, and maybe Mac as well; I don't have crapple stuff!
function swap() {
    # Set the default size to 20 GB
    local default_swap_size=20

    # Check if an argument is supplied
    if [ -z "$1" ]; then
        read -p "Enter Swap Size (GB) [default: $default_swap_size]: " sizeofswap
        # If no input is provided, use the default size
        sizeofswap=${sizeofswap:-$default_swap_size}
        echo "Setting New Swap Size To $sizeofswap GB"
    else
        echo "Setting New Swap Size To $1 GB"
        sizeofswap=$1
    fi

    # Turn off and remove the existing swap file, if there is one
    sudo swapoff /swapfile 2>/dev/null
    sudo rm -f /swapfile

    # Create, secure, and enable the new swap file
    sudo fallocate -l "${sizeofswap}G" /swapfile
    sudo chmod 600 /swapfile
    sudo mkswap /swapfile
    sudo swapon /swapfile

    echo "New Swap Size Of $sizeofswap GB"
    free -h
}
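For anyone pasting the function into their shell config, usage looks like this:

# run with no argument to be prompted for a size (default 20 GB),
# or pass the size in GB directly
swap
swap 32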
that’s too bad. Apple gets so much more right....;-)
Why the need, in every single video, to emphasise that you were part of the Ollama founding team?
For most people this is the first video of mine they have seen. Why should I be listened to about Ollama? It's about credibility. And I can see that watch time has generally gone up since I added that.
That's also why I add some pics from the group, to give the regulars something to see.