Thank you for keeping us up to date with the rapid changes that are happening! Can you do a short video comparing all the agent frameworks: which ones to use for different scenarios and for different levels of developer experience?
Very helpful demo. I like simple agent frameworks like swarm and this one, because as the models grow in power, these non-abstracted frameworks will be able to get more done without changing anything. I wish this had the built-in function-to-tool conversion that swarm had.
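For anyone curious what that conversion looks like, here is a rough sketch of the idea: a hypothetical helper (not part of swarm or this framework) that turns a plain Python function into an OpenAI-style tool schema by reading its signature, type hints, and docstring. The name `function_to_tool` and the type mapping are my own assumptions.

```python
from typing import get_type_hints

# Rough mapping from Python annotations to JSON Schema types.
PY_TO_JSON = {str: "string", int: "integer", float: "number", bool: "boolean"}

def function_to_tool(fn):
    """Build an OpenAI-style tool schema from a plain Python function.

    Hypothetical helper, similar in spirit to swarm's built-in
    function-to-tool conversion; not a real API of either framework.
    """
    hints = get_type_hints(fn)
    hints.pop("return", None)
    params = {
        name: {"type": PY_TO_JSON.get(tp, "string")}
        for name, tp in hints.items()
    }
    return {
        "type": "function",
        "function": {
            "name": fn.__name__,
            "description": (fn.__doc__ or "").strip(),
            "parameters": {
                "type": "object",
                "properties": params,
                "required": list(params),
            },
        },
    }

def get_weather(location: str) -> str:
    """Return the current weather for a location."""
    return f"The weather in {location} is sunny."

schema = function_to_tool(get_weather)
```

A real implementation would also pull per-parameter descriptions from the docstring, but the signature-plus-type-hints approach above is the core of it.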
This is great! Can this framework help with editing long documents: cleaning up spelling mistakes, adding titles, and dividing them into paragraphs? In this video there was no memory management that goes beyond the token limits.
5:01 It looks like the tool is just set to return the location and a fixed temperature of 25°C, which is why it is 'working', or am I missing something? It tries multiple times with each option for Celsius, but it's hard-coded to Celsius anyway.
Yes, in this case, it is just a hard-coded tool. What we are looking for is whether the model actually calls the tool, gets the response, and deals with it properly. You can see that some of them kept calling it even after getting a result, whereas the better models just call it once, get the response back, and use that.
What would be the sweet spot for an LLM that is free and open source, but good enough that an agentic workflow can make up for the LLM's shortcomings? Andrew Ng claimed that agents can do this, but your analysis casts doubt on that notion.
Well, I don't think it's that small LLMs can't; it's more that SmolAgents is a pretty heavy framework that's more built to be a small framework *for large models*. I think if you break a problem down into a tree structure where every agent has one thing to do, and then propagate that information back to pooling agents that summarize more and more information and need fewer tools, you can probably get pretty alright results, even with small models. At a guess, I'd imagine Mistral Small is a pretty good balance of performance and cost efficiency.
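The tree idea above can be sketched in a few lines. Everything here is hypothetical scaffolding: `call_llm` is a stand-in for whatever small model you run, and the agent names are made up for illustration.

```python
# Sketch of the tree decomposition: leaf agents each do one narrow
# task, and a pooling agent summarizes their outputs so the next
# level up needs less context and fewer tools.
def call_llm(prompt: str) -> str:
    # Placeholder for a real call to a small model (e.g. Mistral Small).
    return f"[summary of: {prompt[:40]}...]"

def leaf_agent(task: str) -> str:
    """One agent, one job: answer a single narrow sub-task."""
    return call_llm(f"Do exactly this and nothing else: {task}")

def pooling_agent(results: list[str]) -> str:
    """Summarize child results so the parent sees less context."""
    return call_llm("Summarize these findings:\n" + "\n".join(results))

subtasks = ["find the dataset", "check its license", "note its size"]
answer = pooling_agent([leaf_agent(t) for t in subtasks])
```

The design choice is that each leaf prompt is small and single-purpose, which is exactly where a weaker model is least likely to get lost.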
I've recommended your channel to everyone who asks me where they can learn about cutting-edge gen AI stuff!!
Thank you so much bro
Thank you. Nice video.
Since you introduced Pydantic-AI, it has been my go-to framework. It would be great if you could add some videos on advanced use cases for it 🙏🏼
They are certainly on the way. They added some changes around multi-agents, so I will follow up with that.
@samwitteveenai thanks, that would be amazing
I tried different models, but often get: `Input validation error: `inputs` tokens + `max_new_tokens` must be