Awesome, man! I was not aware of customizing Ollama with this kind of Python script! Thanks :)
This is quite useful!
It gives me some great ideas for my own local apps!
Hi, I love all your videos. Could you please make a video on getting structured output with Ollama? My use case is extracting specific information from an image so the data can be added to a database automatically.
Thanks in advance.
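For anyone wanting to try this, a minimal sketch, assuming the `ollama` Python package and a pulled `llava` model; the field names ("name", "date_of_birth", "id_number") and the file path are hypothetical, not the exact method from the video:

```python
import json
import ollama

prompt = (
    "Extract the following fields from the image and reply with JSON only: "
    "name, date_of_birth, id_number."
)

response = ollama.generate(
    model="llava",
    prompt=prompt,
    images=[open("card.jpg", "rb").read()],  # raw image bytes; the path is a placeholder
    format="json",  # asks Ollama to constrain the reply to valid JSON
)

record = json.loads(response["response"])
print(record)
# `record` is a plain dict from here on, so inserting it into a database
# is ordinary Python (sqlite3, an ORM, etc.).
```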
Are there models that recognize a photo and then vectorize it?
This LLaVA is impressive! My original plan was to detect objects via YOLOv7, give the detected objects to Ollama to get some text, and then play that text through a loudspeaker. LLaVA is detecting much more than just objects, I guess!? Thanks for your video 🙂
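A minimal sketch of that pipeline, assuming the `ultralytics`, `ollama`, and `pyttsx3` packages; the weights file, image path, and model name are placeholders:

```python
import ollama
import pyttsx3
from ultralytics import YOLO

detector = YOLO("yolov8n.pt")        # any YOLO weights you have locally
results = detector("scene.jpg")
labels = {detector.names[int(b.cls)] for b in results[0].boxes}  # unique class names

reply = ollama.generate(
    model="llama3",                  # a text-only model suffices once YOLO found the objects
    prompt=f"Describe a scene containing: {', '.join(labels)}. One short sentence.",
)

engine = pyttsx3.init()              # plays through the default loudspeaker
engine.say(reply["response"])
engine.runAndWait()
```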
How do I add long-term memory to this local LLM???
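One simple form of it is to persist the chat history to disk and replay it on every call. A minimal sketch, assuming the `ollama` package; the file name and model are placeholders, and for real long-term memory you would move to a vector store / RAG setup:

```python
import json
import os
import ollama

HISTORY_FILE = "chat_history.json"  # hypothetical location for the saved conversation

def load_history():
    if os.path.exists(HISTORY_FILE):
        with open(HISTORY_FILE) as f:
            return json.load(f)
    return []

def chat(user_text):
    messages = load_history()
    messages.append({"role": "user", "content": user_text})
    reply = ollama.chat(model="llama3", messages=messages)
    messages.append({"role": "assistant", "content": reply["message"]["content"]})
    with open(HISTORY_FILE, "w") as f:
        json.dump(messages, f)
    return reply["message"]["content"]

print(chat("Remember that my dog is called Rex."))
print(chat("What is my dog called?"))  # answered from the replayed history
```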
Thanks for the video. How can I make sure Ollama runs on the GPU and not on the CPU?
What GPU?
How much VRAM?
Riding the awesomeness wave again!
How can I modify this code to use my local GPU? It seems to default to my CPU, and I can't find an easy way to change this.
It is using my GPU. I have Python 3.9, CUDA 11.2, cuDNN 8, Visual Studio 2019, and a GTX 1660 Ti ("Tuning sm_75").
If my local RAM is 8 GB, which Ollama model would you recommend?
deepseek-coder ❤
This was very helpful, my first time getting results from a multimodal LLM directly using Python.
Thanks :) Is it possible to use this model as an OCR alternative, for example to extract information from a JPEG image of an ID card?
This would be too heavyweight for just that.
Considering YOLO instead would be a better option.
@@sumukhas5418 Thanks for the answer :) Actually, I am trying pytesseract to read ID-card information from photos taken with a phone, and the results are not very good :/ Do you have any ideas on how I could get better results?
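Phone photos usually need preprocessing before Tesseract sees them. A minimal sketch that often helps, assuming OpenCV and pytesseract are installed; the image path is a placeholder:

```python
import cv2
import pytesseract

img = cv2.imread("id_card.jpg")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)             # Tesseract prefers grayscale
gray = cv2.resize(gray, None, fx=2, fy=2,
                  interpolation=cv2.INTER_CUBIC)          # small text benefits from upscaling
_, binary = cv2.threshold(gray, 0, 255,
                          cv2.THRESH_BINARY + cv2.THRESH_OTSU)  # Otsu handles uneven lighting

print(pytesseract.image_to_string(binary))
```

Deskewing the card first (e.g. a perspective transform on its four corners) tends to help even more than thresholding.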
Nice, very helpful!
Is it possible to create embeddings of pictures with the model?
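LLaVA itself isn't an embedding model, but a CLIP model does exactly this. A minimal sketch, assuming the `sentence-transformers` package; the image path is a placeholder:

```python
from PIL import Image
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("clip-ViT-B-32")       # CLIP maps images and text into one vector space
image_vec = model.encode(Image.open("photo.jpg"))  # fixed-size vector for the photo
text_vec = model.encode("a photo of a cat")        # comparable vector for a text query
print(image_vec.shape)
```

Because image and text vectors share one space, cosine similarity between them gives you text-to-image search for free.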
Rad video, thanks dude.
Why does the image parameter take a list if supplying multiple images to it doesn't work?
Is this fully offline? I am not sure you downloaded the 13B 7.4 GB package.
Are Ollama and LLaVA free to use? My specs are 16 GB RAM / 1 TB storage with an RTX 3050 Ti. Which model size suits my device, the 13B one or something else? I am already running Ollama's basic 4 GB model; is it OK to run the 13B model, or some other model like the OpenAI or Gemini API?
Can we get the answer in different languages per the client's requirement, like Hindi, Tamil, or Japanese, if possible?
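Usually this can be steered from the prompt itself, assuming the model was trained on the target language. A minimal sketch with the `ollama` package; the model and image path are placeholders:

```python
import ollama

reply = ollama.generate(
    model="llava",
    prompt="Describe this image. Reply in Hindi.",  # swap in Tamil, Japanese, etc.
    images=[open("photo.jpg", "rb").read()],
)
print(reply["response"])
```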
Thanks for your help, you legend.
Wow, this is too easy to be real. I am using OpenCV to record videos of flying saucers. I could record images and use LLaVA to verify whether there is a flying saucer in them. Can I also search videos with videos: instead of images:?
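There is no videos: input in LLaVA, but a minimal workaround is to sample frames with OpenCV and query each one. A sketch, assuming the `ollama` package; the path, model, and sampling step are placeholders:

```python
import cv2
import ollama

cap = cv2.VideoCapture("sighting.mp4")
frame_idx = 0
while True:
    ok, frame = cap.read()
    if not ok:
        break
    if frame_idx % 30 == 0:                   # roughly one frame per second at 30 fps
        _, jpg = cv2.imencode(".jpg", frame)  # re-encode the frame as JPEG bytes
        reply = ollama.generate(
            model="llava",
            prompt="Is there a flying saucer in this image? Answer yes or no.",
            images=[jpg.tobytes()],
        )
        print(frame_idx, reply["response"])
    frame_idx += 1
cap.release()
```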
What a nice vid. Can I build an AI app without using OpenAI?
RAG + webcam + self-awareness + speech --> tutorial, please!
How much RAM and VRAM are needed?!
With 4-bit quantization, LLaVA-1.5-7B uses less than 8 GB of VRAM on a single GPU. Typically the 7B model can run on a GPU with less than 24 GB of memory, and the 13B model requires ~32 GB; you can use multiple 24 GB GPUs to run the 13B model.
Help me out: for you it took less than 10 seconds to get the output, but for me it takes about 3 minutes. Of course it runs, and I am happy about that, but it is too slow.
My computer takes more than an hour, and the system has a 4 GB 3060 GPU. What can I do?
@@santhosh-j7e I don't know, man. I was working on it for my hackathon. I tried all kinds of PCs, Pentium, i3, i5, i7, but no difference.
7.5 GB?????
It's 4.7 GB for the 7B version.
Oh, I'm too fast.
First comment 😊😊😊