Such a calm, yet punctual voice.
Like ASMR + tutorials.
And the content is easy to follow and understand.
Great Channel!
Yes, some Docker networking content would be most welcome
Great overview and great job at keeping this simple for people.
1:30 It's a similar problem with LaTeX ("lah-tekh"). But in linguistics you can't mix different alphabets within the same word, so I pronounce LaTeX like the material.
Yes, container networking. Podman would be awesome to cover too. We’ve all seen the “Hitler uses Docker” video.
Hi @Matt Williams, if time permits, can you create a video showing how to deploy Dify to a cloud service like Render?
Since you're not on the Ollama dev team anymore, I don't know if it's even ethical to ask... but could you cover something about llamafiles? Is it really true, as they claim, that CPU inference is faster with their llamafiles than with default llama.cpp?
Great video
Great ❤
Hey Matt, I was really excited for this video, only to realize there's no mention of Ollama in it!?!?
I have a project on the go where I'm trying to build a multi-container app using docker-compose, where the containers are backend (FastAPI), frontend (Next.js), and llmServer (Ollama). I'm running into problems having the backend connect to the Ollama server... I get the dreaded [Errno 111] Connection refused
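A common cause of [Errno 111] in a docker-compose setup is the backend calling localhost, which inside the backend's own container doesn't reach the Ollama container; on the compose network, services reach each other by service name. A minimal sketch, assuming the service is named llmServer as in the comment, Ollama is on its default port 11434, and the backend has requests installed:

```python
# backend/ollama_check.py (hypothetical name): connectivity check for the compose setup
import os

import requests

# Inside the compose network, the Ollama container is reachable by its
# service name ("llmServer" here), not by "localhost", which resolves to
# the backend container itself and typically gives [Errno 111].
OLLAMA_BASE_URL = os.environ.get("OLLAMA_BASE_URL", "http://llmServer:11434")


def list_models() -> dict:
    """Hit Ollama's /api/tags endpoint to confirm the backend can reach it."""
    resp = requests.get(f"{OLLAMA_BASE_URL}/api/tags", timeout=10)
    resp.raise_for_status()
    return resp.json()


if __name__ == "__main__":
    print(list_models())
```

Keeping the base URL in an environment variable also makes it easy to point the same backend at a remote Ollama host later.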
This was 100% about using tools with ollama and docker.
Ahh. I see how you can think that. But I wouldn’t use ollama in docker anyway. This is just about the UIs.
@@technovangelist Wow, thanks for responding, Matt, much respect for what you're doing! Considering your comment "... I wouldn’t use ollama in docker ...", might I be so bold as to ask: if you were me and you needed to host this app on Azure (which I do), how would you go about hosting Ollama?
Got it. That makes sense. Docker on a host vs docker on localhost can be different. If you are running their container service rather than an instance then that makes sense. Have you had success getting access to a real GPU? Last time I tried I could only get their generically named cards and not a real AMD or Nvidia card.
@@technovangelist Hi again Matt, and thanks for the continuing engagement! To answer your question: yes, we have had success getting a VM with a GPU. There are some "N series" options on Azure for us mere mortals, e.g. NV4as_v4, which is a 4-core, 14 GB server with an AMD GPU (this is the smallest of 3 in this series and costs $170/month).
I've stood up an Ollama server on one of these and I can test the connection to it over the internet successfully. In my app I have the baseURL of the Ollama server set up as an environment variable, so I can swap it out... but when I do, I get connection issues :( Interestingly, yesterday I also set up a serverless endpoint on Azure for a llama2 model and ran into the same problems, so the issue might be totally unrelated to Ollama!?!
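If the VM itself is reachable but the app can't connect, two things are worth ruling out (assumptions about this setup, not a diagnosis): Ollama binds to 127.0.0.1 by default, so the VM usually needs OLLAMA_HOST=0.0.0.0 and port 11434 open in the network security group, and the base URL in the environment variable needs the scheme and port. A rough smoke-test sketch, assuming an OLLAMA_BASE_URL variable as described above:

```python
# Hypothetical smoke test for the remote Ollama endpoint.
import os

import requests

# e.g. OLLAMA_BASE_URL=http://<vm-public-ip-or-dns>:11434
base_url = os.environ["OLLAMA_BASE_URL"].rstrip("/")

# /api/generate is Ollama's completion endpoint; "llama2" is just the model
# mentioned above, so swap in whatever has been pulled on the VM.
payload = {"model": "llama2", "prompt": "Say hello.", "stream": False}
resp = requests.post(f"{base_url}/api/generate", json=payload, timeout=60)
resp.raise_for_status()
print(resp.json()["response"])
```

If this works from a laptop but not from inside the app's container, the problem is likely on the app's network path rather than with Ollama itself.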
Naming things is hard 😂
It's not pronounced "en-gin-ex"; it's pronounced "en-gin-chi".
That is definitely "en-gin-ex". It has been since it started out in Russia. I spoke at nginx conf for many years.
@@technovangelist I think it was probably a joke
@@technovangelist I know haha, I was being facetious XD.
Ah yes, Chinese President “KAI JinPing”
I have no idea what this comment means