I'm truly impressed by your explanation. As a complete beginner in this field, I found your ideas very easy to understand. You deserve a larger audience and more support. I'm grateful for experts like you who can break down complex topics and make learning accessible for newcomers like myself.
I truly appreciate the time Matt took to provide a comprehensive and detailed explanation of the function/tool calling processes. His technical explanations were spot on and completely resonated with me. I found his breakdown to be clear and insightful, making it easy to grasp the concepts involved.
Love your videos! Your last vid about function calling really cleared some things up. I used that knowledge to create a market research bot for my company! They loved it, now I've jumped from a frontend typescript tev to AI operations engineer
@technovangelist I would be stoked to see a "top 5 ollama models for different tasks" style video. I'm just using Llama 3 for everything right now. Some tasks I would like to optimize for speed, others for depth.
You are correct! Function calling is actually made possible simply by the reasoning capacity, such that it is, of the model. There is nothing more than that. It is a convenient abstraction for service interactions. Instead of function calling we could just call it "if you think you need it you may ask for the following ...". BTW, this type of process reasoning is also used for agentic interactions when deciding workflows.
I love your videos, man. This is one big thing I’ve picked up along my journey: Always start from specific information or ideas, and go towards generalities (not the other way around). In a classroom setting, where people prepare their wallets and minds for a learning event, maybe starting with generalities is better. But in real life, there are so many distractions. Distractions lead to confusion. Confusion leads to annoyance and madness. Moderate specificity requires the least amount of a consumer’s time and attention to get started. Maximizes potential engagement. Minimizes annoyance or friction. Am always telling support or client-facing folks this exact same thing when they ask wide open questions, and then complain that devs are frustrated or taking a long time to respond. Or maybe the dev team is falling behind on a new feature (of course they will when always having to context switch due to wide open query or too many unresolved details). Goes back to starting from specific details, so the consumer remains locked into their original intent and context, as much as possible. Not everyone who clicks here and watches these videos is watching with pen and paper in hand taking notes for a college exam. In order to potentiate their involvement, specific repeatable easy details are key.
Sorry I don’t understand that response. Also I updated/clarified the message during your response. Not trying to preach. Just my observations of life applied to what was discussed.
OK, well, if there is something that you don’t understand about it, can you point out the spot that is not understood? That’s kind of my case in point.
Good stuff. And yes, I got agent-like behavior without the framework by prompting only. And yes, no need for a specific model, or fine-tuning. prompt before the prompt works well.
i took your original example and implemented it without a problem. people are just so dense. I've been using it on my project since the original video without a problem.
Timely. I have been battling getting function calling to wotk right. Sadly, many of the examples out there don't work with different models, they seem to all assume OpenAI. I look forward to giving your approach a try!
I think the biggest problem for you, is that most people will read something in docs or on a blog and then claim to understand. Then attack someone even though they dont dont actually understand what they talking about. You only understand once you implement and use the functionality. And to your 2nd point. You 100% correct theirs no reason for a model to be finetuned for function calling, i discovered function calling with gpt3.5 about 3 months after chatgpts launch.
i came to this exact same conclusion. i was going crazy trying to get openai to follow an exact line-oriented function calling format (not json). in some ways, function calling would not exist at all; if it was not so hard to make the LLM just follow the format requested by the system prompt.
I was writing a form-filling bot, that switches to your language, and interrogates you until the form is completely filled out. When it responds, it is asked in system prompt to reply with a per-line format. Any number of responses that we will parse: SAY {message} SET {fieldName} {fieldValue} SAVE In response to "My name is Robert Jason Fielding" it could respond like: SAY Thanks for your full name! SET firstName Robert SET middleName Jason SET lastName Fielding SAVE SAY What is your social security number? It is kind of tricky to make it obey the system prompt. And this is far far simpler than the openai function calling APIs. The basic idea is to wrap a database in a layer of language common-sense.
Awesome! I am glad there are people like you to simplify and RE-explain the basics to the "writers". ☺ I really appreciate you coming and stepping on the trolls' feet. Perfect 👌 I see no reason to get excited about incompetent comments. Just chill and explain nicely 👏
Hi Matt, I watched again this video with pleasure, and it got me thinking again :-). First of all, please avoid the trap of dividing your followers into lovers and haters. You produce top-notch content, and there's no need to apologize or dramatize (appreciating you irony). Let me delve into a point that emerged in your demo/experiment, which is, in my opinion, more significant than the function calling "issue". You verified that the majority of open-source models available on Ollama are able to produce the expected JSON. That's somewhat surprising to me. This demonstrates, as you suggested, that the OpenAI function-calling fine-tuned models are just marketing, but wait. I remember that the old GPT-3.5 OpenAI "instruct" models like "text-davinci-003" were able to produce JSON (so function calling-JSON if you will), but subsequent chat COMPLETION models (fine-tuned for lists of system/assistant/user messages) weren't! So, my guess is that OpenAI released the function-calling fine-tuned models later to correct the chat-completion fine-tuning?! Ironic again. But, back to the Ollama models-I'm still perplexed. Are these optimized for both (at the same run-time) CHAT completion and "function calling" (aka JSON outputs)? This could maybe be a topic for another video...? By the way, it would be kind of you to share the code on your GitHub repo as usual, but anyway the video is absolutely explanatory. I'll take a look at the "Tools" #5284 Ollama PR. In my opinion, standardization could help the community around Ollama, even if you demonstrated that any user-made schema does the job. Thanks always for sharing great content. I appreciate your effort. Chapeau Giorgio
When it comes to function calling with lots of highly specific parameters I find that the models that I can run in ollama are simply uncapable of following the schema i provide. Whereas openai and claude do an excellent job following a large json schema. So when it comes to a mod being "trained for function calling" I think they mean trained to follow large and strict schemas well as thats really the main difference. You will see a difference if your function call (basically json output) needs to follow a large and strict schema. Everyones examples are too small to notice the change.
Have you tried. I just tweaked my code to use 10 parameters per function. And gave a more complicated prompt. Worked just fine. But if you are doing that you probably have bigger issues outside of the model.
@@technovangelist 'Disconnected Engagement'. Stay fully engaged with what you are working on and your goals but disconnected from trolls, detractors and negative feedback. All the best with your channel.
Got it. Thanks. Luckily the negative is a small fraction of the rest of the comments. And I don’t spend too much time on it. I had fun with this one though. Thanks for the comment.
You make it sound like reading my comments is something I would want to avoid. Ideas come from comments. Connection comes from comments. This would be the last thing I would ever want to outsource to an ai or other human.
The libraries are creating abstractions over a document and we can forget where the abstraction layer ends and where the substrate begins. I'm going to try out this pattern. Lots of libraries make it easy to swap out models expecting completions vs conversations... fewer libraries have a nice clean way to swap out models that handle function calling differently.
that's amazingly simple, nice. I'm guessing one scenario where someone would still want an agent-framework is if the framework was a low/no-code workflow. I'd love to see a video on whether running models with GPTQ quantization is worthwhile. Most explanations I've seen amount to "GPTQ is for GPUs, GGML is for cpus" without saying why GPTQ is completely neglected in projects like ollama, or if there is even a meaningful advantage to either at this point.
Would you mind answering a question? With you use this do we essentially pass in the instructions each time we prompt? For example if i tried to do this manually, i would just repeat my instructions each time (which also include the formatting) along with the new question? I wasnt sure what system prompt vs user prompt means, do they both end up in the same place anyway
I've been playing with AI and tools for quite long and I just got the same opinion as you. It's not complex at all but it does bring a lot to the table. Also, agents frameworks are not needed for most use cases.
As you grow in popularity, you may experience that your closest supporters will apply the greatest scrutiny. It doesn't mean you're disliked, no matter the perception of tone.
So glad you made this video! Could you perhaps go into why apps like the ones that assist with coding or app creation that use function calling may fail with local models, but work seamlessly with the cloud models? I think this is an area where people are struggling based on the many issues I see in Github repos.
I think a lot of folks don’t realize that function calling is possible in ollama. There are folks who seem intent on spreading the notion that function calling is more than it is. And so they kind of brute force their way through rather than taking the simpler approach. But that’s just a guess. Can you point me to some of the issues you have seen?
Considering openAI as the ONLY solution is not smart. In a lot of usecases you can get away with opensource models like llama3, mixtral, deepseek etc. And try not to blame Ollama its just a library to run quantized open source model locally, and give you a API interface just like OpenAI 😆
Thanks for your videos and demo code Matt... very helpful. And sorry that some people are nasty and hateful. There os no need for that. It is sad that some people feel they have permission to vent their anger and negativity and harm others.
i think that ollama implementation of function calling is by forcing `{` tokens at the starting to force the model to generate function call. correct me if i am wrong.
Can I say that, if my prompt is clear enough, I can have function calling using any module? Since it’s just helping the software to decide which functions to call, right? Thanks for the explanation, it’s mind blowing to me
Matt - haters suck. You’re doing great and it’s awesome that you’re willing to share your wisdom and knowledge. Please ignore the jerks, we’re surrounded by ass holes.
I dont use OpenAI function calling at all - it's just a wrapper for JSON conversion and interpretation of the output, and I'd rather keep control of that myself to make it more portable between LLMs. Why would anyone write something that is locked to a LLM interface definition when we live in such a turbulent world. I'd encourage everyone to do the same. I cant honestly see any benefit in using the "function call" feature versus rolling your own.
@@mattgscox bot gonna lie. I end up finding myself in your position. And realize OpenAPI specs is the answer to it. I don't want to give away my digital independence.
funny, but also three minutes of my life...ill go back to watching the original. i was half way to trying it, when some dilwad distracted me from that excellent content. update: The code eventually returned 200 success in an empty message. If i ask on the ollama interface, I do get Berlin. Probably an ollama update.
😆I felt like this was a dev version of this vid: th-cam.com/video/0Szj21arytU/w-d-xo.html. Really enjoyed this one. As always, thanks for putting out great and useful content!
actually that bit was interesting to see that every single model produced not just the correct output but the right out put.... personal;y i have found that using such techniques means after you get your final response you will need to unload and rerload the model or clear the cache so the model can prepare for the next question ?
If you are having to unload and reload there must be something very strange with your setup. Is this with ollama? Have you updated to the latest versions? There is no need to do such things.
most of youtuber doesn't care about what their viewer agrees or disagree (and how they said it) but you handle it as if they are part of a....? community... ? a.... companion along ollama adventures... ? in the first place most youtuber doesn't care and move on to next video at the end those nasty words are just a comments and when newer videos come up those disagree-er would come back to watch newer video.... it also happen with wes roth channel, david saphiro channel, even Kamph channel.... well anyway, I've been watching TH-cam unreasonable long, i don't have local TV or Netflix all i got are smart tvs, android boxes, tablets all over my places, at the office at my room at my home at my car everywhere they all mostly 24 hour playing TH-cam videos. and matt you're the only one after all these years a youtuber whom really care and serious about what you said and the recent event was surprisingly handled in deferent degrees. you're treating your channel in different way it is interesting way of youtube-ing don't stop matt, unless you got private issue.
I think that is part of my background as an evangelist or as some companies call it, a dev advocate, though that’s a misleading name. Build a community, have conversations, relay feedback back to the team. I have incorporated so much feedback into my videos every time. Thanks for the comment and thanks for noticing.
yeah function calling is just making llm to choose what function to use and specifying the required param as structured output. I am amazed how dumb people are ... just try to code up simple example and run it.
no not dumb .... they are many components to an ai system you can just use inputs and outputs ... but there is alot more you can do with a base model ! as we amy see a tutorial or example of your Mistral model , flying your RC helecopter !
Sooo, that was exactly what I wrote on your other video? That you can use ANY model for this, as long as it returns json in the response. As it's the "calling party" that actually runs the code/function, it has nothing to do with the model itself. But you claimed this was "added" to later models? I am confused.
Hmm not sure what other video you are referring to. But if I said something that sounded like I suggested it was added in the model I was simply not stating what I meant clearly. Function calling was added to ollama in October or November. So later than the initial release in June. That’s what I would have meant.
@@technovangelist Ok, you wrote that it was added in Llama 2, which is a model. If you meant Ollama, it makes more sense. However, what exactly prevented me from doing this with the very first version of ollama? As long as I make my own scripts that talks directly to the ollama API, why would I not be able to "ask it to return json" and simply run functions in my script based on the response? That is the part that I still do not get. Why would any type of "support for function calling" need to be added to either the model or the "wrapper" (ollama in this case) for it to work?
If you did that at the beginning the answer would have probably been something like: “sure, here is the json: {…”. It wouldn’t have been just the json. Folks were adding instructions like no prose etc to get the model to follow the instructions
@@technovangelist That's odd. It worked perfectly fine for me to say "only respond with a json object, nothing else" even on the very first models. Anyways, doesn't really matter.
I'm truly impressed by your explanation. As a complete beginner in this field, I found your ideas very easy to understand. You deserve a larger audience and more support. I'm grateful for experts like you who can break down complex topics and make learning accessible for newcomers like myself.
The way you explain thinks..... Is soooo pedagogical . The tone and the voice nuance ..musical in the ears 😊 .
Thanks
best content creator at explaining complex things simply
I truly appreciate the time Matt took to provide a comprehensive and detailed explanation of the function/tool calling processes. His technical explanations were spot on and completely resonated with me. I found his breakdown to be clear and insightful, making it easy to grasp the concepts involved.
Love your videos! Your last vid about function calling really cleared some things up.
I used that knowledge to create a market research bot for my company! They loved it, now I've jumped from a frontend typescript tev to AI operations engineer
Nice. hope that came with a bit of a pay bump.....let me know the next thing you need and I can try to cover that too.
@technovangelist I would be stoked to see a "top 5 ollama models for different tasks" style video. I'm just using Llama 3 for everything right now.
Some tasks I would like to optimize for speed, others for depth.
You are correct! Function calling is actually made possible simply by the reasoning capacity, such that it is, of the model. There is nothing more than that. It is a convenient abstraction for service interactions. Instead of function calling we could just call it "if you think you need it you may ask for the following ...". BTW, this type of process reasoning is also used for agentic interactions when deciding workflows.
if your model can write code then it can call a function !
The way you explained it before is way more robust than how most frameworks/providers accomplish things with a tool use abstraction.
I love your videos, man. This is one big thing I’ve picked up along my journey:
Always start from specific information or ideas, and go towards generalities (not the other way around).
In a classroom setting, where people prepare their wallets and minds for a learning event, maybe starting with generalities is better. But in real life, there are so many distractions. Distractions lead to confusion. Confusion leads to annoyance and madness.
Moderate specificity requires the least amount of a consumer’s time and attention to get started. Maximizes potential engagement. Minimizes annoyance or friction.
Am always telling support or client-facing folks this exact same thing when they ask wide open questions, and then complain that devs are frustrated or taking a long time to respond. Or maybe the dev team is falling behind on a new feature (of course they will when always having to context switch due to wide open query or too many unresolved details).
Goes back to starting from specific details, so the consumer remains locked into their original intent and context, as much as possible. Not everyone who clicks here and watches these videos is watching with pen and paper in hand taking notes for a college exam. In order to potentiate their involvement, specific repeatable easy details are key.
That wasn’t the goal here.
Sorry I don’t understand that response. Also I updated/clarified the message during your response. Not trying to preach. Just my observations of life applied to what was discussed.
I guess I didn’t understand the comment.
OK, well, if there is something that you don’t understand about it, can you point out the spot that is not understood? That’s kind of my case in point.
At any rate, thank you for your videos and contributions
Ollama tools got merged, the day after you mentioned it :-). Thanks for the push
Nice
Good stuff. And yes, I got agent-like behavior without the framework by prompting only.
And yes, no need for a specific model, or fine-tuning. prompt before the prompt works well.
i took your original example and implemented it without a problem. people are just so dense. I've been using it on my project since the original video without a problem.
Timely. I have been battling getting function calling to wotk right. Sadly, many of the examples out there don't work with different models, they seem to all assume OpenAI. I look forward to giving your approach a try!
Pure gold! Always appreciate your concise explanation and humour.
I think the biggest problem for you, is that most people will read something in docs or on a blog and then claim to understand. Then attack someone even though they dont dont actually understand what they talking about. You only understand once you implement and use the functionality.
And to your 2nd point. You 100% correct theirs no reason for a model to be finetuned for function calling, i discovered function calling with gpt3.5 about 3 months after chatgpts launch.
I wish. It’s pretty clear in the docs. They just see the feature name and assume from there. Thanks for the comment.
Despite your aversion to a reasonable display mode, both of your 'tools' videos make me say 'Whoop, it's not just me. Thank you.
I have no aversion to the reasonable display mode, which of course is light mode...
i came to this exact same conclusion. i was going crazy trying to get openai to follow an exact line-oriented function calling format (not json). in some ways, function calling would not exist at all; if it was not so hard to make the LLM just follow the format requested by the system prompt.
I was writing a form-filling bot, that switches to your language, and interrogates you until the form is completely filled out. When it responds, it is asked in system prompt to reply with a per-line format. Any number of responses that we will parse:
SAY {message}
SET {fieldName} {fieldValue}
SAVE
In response to "My name is Robert Jason Fielding" it could respond like:
SAY Thanks for your full name!
SET firstName Robert
SET middleName Jason
SET lastName Fielding
SAVE
SAY What is your social security number?
It is kind of tricky to make it obey the system prompt. And this is far far simpler than the openai function calling APIs. The basic idea is to wrap a database in a layer of language common-sense.
you are just the inocenpt bystander of this:
I keep finding ways to simplify complexification.
Awesome! I am glad there are people like you to simplify and RE-explain the basics to the "writers". ☺
I really appreciate you coming and stepping on the trolls' feet. Perfect 👌
I see no reason to get excited about incompetent comments.
Just chill and explain nicely 👏
Hi Matt,
I watched again this video with pleasure, and it got me thinking again :-). First of all, please avoid the trap of dividing your followers into lovers and haters. You produce top-notch content, and there's no need to apologize or dramatize (appreciating you irony).
Let me delve into a point that emerged in your demo/experiment, which is, in my opinion, more significant than the function calling "issue". You verified that the majority of open-source models available on Ollama are able to produce the expected JSON. That's somewhat surprising to me. This demonstrates, as you suggested, that the OpenAI function-calling fine-tuned models are just marketing, but wait. I remember that the old GPT-3.5 OpenAI "instruct" models like "text-davinci-003" were able to produce JSON (so function calling-JSON if you will), but subsequent chat COMPLETION models (fine-tuned for lists of system/assistant/user messages) weren't! So, my guess is that OpenAI released the function-calling fine-tuned models later to correct the chat-completion fine-tuning?! Ironic again.
But, back to the Ollama models-I'm still perplexed. Are these optimized for both (at the same run-time) CHAT completion and "function calling" (aka JSON outputs)? This could maybe be a topic for another video...?
By the way, it would be kind of you to share the code on your GitHub repo as usual, but anyway the video is absolutely explanatory.
I'll take a look at the "Tools" #5284 Ollama PR. In my opinion, standardization could help the community around Ollama, even if you demonstrated that any user-made schema does the job.
Thanks always for sharing great content. I appreciate your effort.
Chapeau
Giorgio
When it comes to function calling with lots of highly specific parameters I find that the models that I can run in ollama are simply uncapable of following the schema i provide. Whereas openai and claude do an excellent job following a large json schema. So when it comes to a mod being "trained for function calling" I think they mean trained to follow large and strict schemas well as thats really the main difference.
You will see a difference if your function call (basically json output) needs to follow a large and strict schema. Everyones examples are too small to notice the change.
Have you tried. I just tweaked my code to use 10 parameters per function. And gave a more complicated prompt. Worked just fine. But if you are doing that you probably have bigger issues outside of the model.
Welcome to the real world 🙂
I suggest a disconnected engagement approach.
Love your videos and your style.
what do you mean by disconnected approach?
@@technovangelist 'Disconnected Engagement'. Stay fully engaged with what you are working on and your goals but disconnected from trolls, detractors and negative feedback. All the best with your channel.
Got it. Thanks. Luckily the negative is a small fraction of the rest of the comments. And I don’t spend too much time on it. I had fun with this one though. Thanks for the comment.
Matt, it seems that you go manually through YT comments... would it be possible to use AI to help you with that somehow? 🤔
You make it sound like reading my comments is something I would want to avoid. Ideas come from comments. Connection comes from comments. This would be the last thing I would ever want to outsource to an ai or other human.
The libraries are creating abstractions over a document and we can forget where the abstraction layer ends and where the substrate begins.
I'm going to try out this pattern. Lots of libraries make it easy to swap out models expecting completions vs conversations... fewer libraries have a nice clean way to swap out models that handle function calling differently.
A gentleman right there.
that's amazingly simple, nice. I'm guessing one scenario where someone would still want an agent-framework is if the framework was a low/no-code workflow.
I'd love to see a video on whether running models with GPTQ quantization is worthwhile. Most explanations I've seen amount to "GPTQ is for GPUs, GGML is for cpus" without saying why GPTQ is completely neglected in projects like ollama, or if there is even a meaningful advantage to either at this point.
quantized models are fine .. they work as well as the original full precision in general !!
Speed is ALWAYS dependant on the system !
My favorite video of yours to date. Actually the example clarified some questions I had, so thank you. I personally hope you make more mistakes 😉
Ngl he had me in the first half 😂
Would you mind answering a question? With you use this do we essentially pass in the instructions each time we prompt? For example if i tried to do this manually, i would just repeat my instructions each time (which also include the formatting) along with the new question? I wasnt sure what system prompt vs user prompt means, do they both end up in the same place anyway
I've been playing with AI and tools for quite long and I just got the same opinion as you. It's not complex at all but it does bring a lot to the table.
Also, agents frameworks are not needed for most use cases.
Great springboard on the subject matter. Clear, to the point.
As you grow in popularity, you may experience that your closest supporters will apply the greatest scrutiny. It doesn't mean you're disliked, no matter the perception of tone.
@@emmanuelgoldstein3682 problem was only felt like it was incomplete ! .. as everybody has been giving the same incomplete tutorial ..
can the model do chain of thought before outputting JSON? if yes how to seperate between the JSON output and the chain of thought?
Separate the concerns
Awesome video!
So glad you made this video! Could you perhaps go into why apps like the ones that assist with coding or app creation that use function calling may fail with local models, but work seamlessly with the cloud models? I think this is an area where people are struggling based on the many issues I see in Github repos.
I think a lot of folks don’t realize that function calling is possible in ollama. There are folks who seem intent on spreading the notion that function calling is more than it is. And so they kind of brute force their way through rather than taking the simpler approach. But that’s just a guess. Can you point me to some of the issues you have seen?
Considering openAI as the ONLY solution is not smart. In a lot of usecases you can get away with opensource models like llama3, mixtral, deepseek etc. And try not to blame Ollama its just a library to run quantized open source model locally, and give you a API interface just like OpenAI 😆
It is incredible how some think OpenAI is the only solution that deserves to exist.
Thanks for your videos and demo code Matt... very helpful.
And sorry that some people are nasty and hateful. There os no need for that. It is sad that some people feel they have permission to vent their anger and negativity and harm others.
Hello Matt thanks for the updated example, i had been stuck there too, but never thought once about insulting you beause of my lack of experience 🙂.
What is the name of local search api, you used ?
Searxng
Please can you point to the websearch tool used?
Searxng
can you explain or some useful resources to learn more about introspection and reflecion?
i think that ollama implementation of function calling is by forcing `{` tokens at the starting to force the model to generate function call.
correct me if i am wrong.
I don't know the details but I am 95% sure that has nothing to do with it. I am pretty sure its a gbnf grammar that was set up back in October.
Can I say that, if my prompt is clear enough, I can have function calling using any module? Since it’s just helping the software to decide which functions to call, right?
Thanks for the explanation, it’s mind blowing to me
The important part is to use format:json, and specify to output as json in the prompt.
@@technovangelist interesting, our makes me wonder how does ollama guarantee the output from any LLM model will be in json format?
I accept your apology.
Whew!
Matt - haters suck. You’re doing great and it’s awesome that you’re willing to share your wisdom and knowledge. Please ignore the jerks, we’re surrounded by ass holes.
If I ignore them I don’t get to do fun things like this video.
Is it possible to use function calling with tools with Open-Webui?
I dont use OpenAI function calling at all - it's just a wrapper for JSON conversion and interpretation of the output, and I'd rather keep control of that myself to make it more portable between LLMs. Why would anyone write something that is locked to a LLM interface definition when we live in such a turbulent world. I'd encourage everyone to do the same. I cant honestly see any benefit in using the "function call" feature versus rolling your own.
@@mattgscox bot gonna lie. I end up finding myself in your position. And realize OpenAPI specs is the answer to it. I don't want to give away my digital independence.
funny, but also three minutes of my life...ill go back to watching the original. i was half way to trying it, when some dilwad distracted me from that excellent content.
update: The code eventually returned 200 success in an empty message. If i ask on the ollama interface, I do get Berlin. Probably an ollama update.
Like a boss!
Haters are just insecure. Great work.
100% clear. thanks
Glad it helped
Yeah, I am not getting consistent function names. Model keeps changing them. Parameters are good. So for me it’s not stable no matter the model I use
interesting. would love to see the code you are running. I haven't been able to get it to fail ever.
@@technovangelist I got it now. Forgot to stringify the json object.
So basically I have it working in JavaScript, including agents
Gave me a good laugh.
lol. love it. spread more knows
I dont know if i will ever be forgive you for this. How could do this to us 😂❤
lol
😆I felt like this was a dev version of this vid: th-cam.com/video/0Szj21arytU/w-d-xo.html. Really enjoyed this one. As always, thanks for putting out great and useful content!
Really Gemma can do this? From the examples I've seen that model is pretty dumb, so if an SLM such as this can do function calling I'm impressed
That was gemma2 I think.
actually that bit was interesting to see that every single model produced not just the correct output but the right out put....
personal;y i have found that using such techniques means after you get your final response you will need to unload and rerload the model or clear the cache so the model can prepare for the next question ?
If you are having to unload and reload there must be something very strange with your setup. Is this with ollama? Have you updated to the latest versions? There is no need to do such things.
Great!!!
🔥🔥🔥
most of youtuber doesn't care about what their viewer agrees or disagree (and how they said it) but you handle it as if they are part of a....? community... ? a.... companion along ollama adventures... ?
in the first place most youtuber doesn't care and move on to next video at the end those nasty words are just a comments and when newer videos come up those disagree-er would come back to watch newer video.... it also happen with wes roth channel, david saphiro channel, even Kamph channel....
well anyway, I've been watching TH-cam unreasonable long, i don't have local TV or Netflix all i got are smart tvs, android boxes, tablets all over my places, at the office at my room at my home at my car everywhere they all mostly 24 hour playing TH-cam videos. and matt you're the only one after all these years a youtuber whom really care and serious about what you said and the recent event was surprisingly handled in deferent degrees. you're treating your channel in different way it is interesting way of youtube-ing don't stop matt, unless you got private issue.
I think that is part of my background as an evangelist or as some companies call it, a dev advocate, though that’s a misleading name. Build a community, have conversations, relay feedback back to the team. I have incorporated so much feedback into my videos every time. Thanks for the comment and thanks for noticing.
yeah function calling is just making llm to choose what function to use and specifying the required param as structured output. I am amazed how dumb people are ... just try to code up simple example and run it.
no not dumb .... they are many components to an ai system you can just use inputs and outputs ... but there is alot more you can do with a base model !
as we amy see a tutorial or example of your Mistral model , flying your RC helecopter !
Sooo, that was exactly what I wrote on your other video? That you can use ANY model for this, as long as it returns json in the response. As it's the "calling party" that actually runs the code/function, it has nothing to do with the model itself. But you claimed this was "added" to later models? I am confused.
Hmm not sure what other video you are referring to. But if I said something that sounded like I suggested it was added in the model I was simply not stating what I meant clearly. Function calling was added to ollama in October or November. So later than the initial release in June. That’s what I would have meant.
@@technovangelist Ok, you wrote that it was added in Llama 2, which is a model. If you meant Ollama, it makes more sense. However, what exactly prevented me from doing this with the very first version of ollama? As long as I make my own scripts that talks directly to the ollama API, why would I not be able to "ask it to return json" and simply run functions in my script based on the response? That is the part that I still do not get. Why would any type of "support for function calling" need to be added to either the model or the "wrapper" (ollama in this case) for it to work?
If you did that at the beginning the answer would have probably been something like: “sure, here is the json: {…”. It wouldn’t have been just the json. Folks were adding instructions like no prose etc to get the model to follow the instructions
@@technovangelist That's odd. It worked perfectly fine for me to say "only respond with a json object, nothing else" even on the very first models. Anyways, doesn't really matter.
haters will hate, but you rock man! ty for your videos !
Thank you, thank you. you just confirm my thoughts and totally clear my doubts. Thank you, thank you.