This is interesting. Interested in what you think of my opinion here. I think we should give the bot the broad-level instructions for how the flow can and should go (for example, the way it responds to the user) in the system prompt. This is all UI stuff after all. The tool itself should have interfaces that are more programmatic. For example, the input should be "Date in ISO 8601" etc. and the output should be "complete" or "done" or a data structure with the response. The LLM should (and in my experience can) then understand these input requirements and output messages, and it should be the one that generates natural language, as opposed to the tool returning natural language. This means, for example, that we can change the style or language of the bot without changing the backend tool code.
Cool demo though, thanks.
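A minimal sketch of that split in Python, for what it's worth (the function and field names here are made up, just to illustrate the pattern):

```python
from datetime import date

def create_booking(booking_date: str, party_size: int) -> dict:
    """Backend tool: programmatic in, programmatic out.

    booking_date must be ISO 8601 (e.g. "2024-03-01"); the return
    value is a plain data structure, never natural language.
    """
    parsed = date.fromisoformat(booking_date)  # raises ValueError if not ISO 8601
    # ... persist the booking somewhere here ...
    return {"status": "complete", "date": parsed.isoformat(), "party_size": party_size}

# The LLM sees {"status": "complete", ...} and writes the user-facing
# sentence itself, in whatever style/language the system prompt specifies,
# so the backend code never changes when the bot's tone does.
```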
Nice - good to see a less censored model _with_ function calling 🎉 Will hopefully pressure others to follow suit, as anyone who's used ChatGPT will have run across ridiculous “refusals” which can often be overcome by persisting…
So far I’ve found this to be the best LLM as a code assistant for producing Python code. It was also good to see it has less censorship, and fewer of the issues that brings.
Interesting. I haven't done that much with it for coding assistance; I'll check that out more.
Awesome 😎!!
And it's cool to see a "small" French company compete with the big players!
Can't wait to see what Gemini adds to function calling. I hope they do it a bit differently, more flexibly.
Wouldn't it be awesome to have an LLM able to set up its own tools?
I was going to make a similar comment. LLMs need to be able to create their own functions, test them, then put the functions in their own (or a public) library for reuse.
These AIs are definitely gonna rule the world someday.
@Sam - Any chance of doing a video about AutoGen Studio 2? I think your style of video could do it justice, explaining it and extending the idea of using Mistral for function calling, or "skills" as AutoGen Studio calls them.
I am just working on a vid for CrewAI, but I plan to do a lot more content on agents in general, so I will probably do a video about AutoGen, though more from a code perspective.
Is it possible/useful to add a system prompt with specific rules for the model to follow, before starting the proper conversation with the restaurant customer? Or are function calling and system prompts mutually exclusive?
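As far as I can tell they're not mutually exclusive; a rough sketch with the (v0) mistralai Python client, passing a system message and tools in the same request (the booking tool here is a hypothetical example):

```python
from mistralai.client import MistralClient
from mistralai.models.chat_completion import ChatMessage

tools = [{
    "type": "function",
    "function": {
        "name": "make_reservation",
        "description": "Reserve a table at the restaurant",
        "parameters": {
            "type": "object",
            "properties": {"date": {"type": "string", "description": "ISO 8601 date"}},
            "required": ["date"],
        },
    },
}]

client = MistralClient(api_key="...")  # or read from MISTRAL_API_KEY
response = client.chat(
    model="mistral-large-latest",
    messages=[
        # System prompt with the rules...
        ChatMessage(role="system", content="You are a polite booking assistant. Always confirm the date before booking."),
        # ...and the customer conversation alongside it.
        ChatMessage(role="user", content="Hi, I'd like a table for two."),
    ],
    tools=tools,
    tool_choice="auto",
)
print(response.choices[0].message.content)
```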
Can you create a video of Gemma with function calling?
This is an interesting model, though it's in kind of an awkward space: if I wanted to do something with it, it's a bit impractical without bare-metal access to the model, so I'd generally just use OpenAI, honestly. If I want customization, I'd rather fine-tune a model and run it myself, and if I want a big corporate model behind a wall, I would just use OpenAI.
I think it might be kind of interesting if they allowed various compute providers (Groq, etc.) to provide it at a lower cost (paying some sort of royalty to Mistral) or at higher throughput. Then people could build really custom, super-high-bandwidth solutions (like scaling with test-time compute) that can require thousands of responses to a single request to pick the most valid solution; doing that is a bit impractical with OpenAI at the moment, as I see it.
I'm waiting for part 2 😁
Great video. It would be interesting to know how it does if some of the required parameters are not given. Does it ask for them, or will it fill them in arbitrarily? Because this is a problem I've seen with OpenAI models: they sometimes ask for the missing parameters and other times fill them with arbitrary values, even when I make the parameters required and tell the model to ask for missing ones in the system prompt.
Yeah, in the 2nd example I show exactly that; you can see it asked for the time for the booking, as it already had the day.
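For anyone curious, that behaviour hinges on the "required" list in the tool schema; a rough sketch of the kind of definition involved (the names are hypothetical):

```python
booking_tool = {
    "type": "function",
    "function": {
        "name": "make_reservation",
        "description": "Reserve a table at the restaurant",
        "parameters": {
            "type": "object",
            "properties": {
                "day":  {"type": "string", "description": "Day of the booking, ISO 8601"},
                "time": {"type": "string", "description": "Time of the booking, e.g. 19:30"},
            },
            # Both fields are required: a well-behaved model should ask the
            # user for whichever one is missing instead of inventing a value.
            "required": ["day", "time"],
        },
    },
}
```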
Thanks Sam. Very nice.
My pleasure!
Thank you for the video!
What does native function calling mean in this context? Never mind, I found the answer in the video.
The number of tokens is way too low compared to the other models in the top 5. But I am very happy that there is a non-American solution available.
Do you think that was deliberate economising for training or inference?
@@dansplain2393 They don't have the resources of those giants, so something had to give.
@@dansplain2393 I'm inclined to think they made it leaner for stronger reasoning cohesion and tighter parameters on inference, to prevent many of the previous shortcomings around hallucination. I'm willing to bet they've restructured their tokens and associations again, and are potentially using a larger recursion loop. Giving it the ability to simulate "thinking about thinking" is the way to approach human-level context awareness. Anthropic is already utilizing an aspect of this, but it doesn't currently seem to be at this level in Mistral. I think you're spot on.
32k is double the limit of standard GPT-4, isn't it?
If you compare the prices, GPT-4 (32K) is 4-6 times more expensive than Mistral Large.
This presumes they have stopped training. It is quite possible/probable they are still training the base model even more. Just realized you probably mean the context window? That could probably be extended; there are lots of approaches for that.
I want to run zephyr-7b-beta locally. Is it possible on an Intel Mac with 16 GB RAM?
yes
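For what it's worth, a minimal sketch with llama-cpp-python (a 4-bit quantized 7B GGUF is roughly 4-5 GB, which fits comfortably in 16 GB of RAM; the exact file name depends on which quant you download):

```python
# pip install llama-cpp-python
from llama_cpp import Llama

# Assumes a quantized GGUF of zephyr-7b-beta downloaded from Hugging Face
# (e.g. a Q4_K_M file); the path below is a placeholder.
llm = Llama(model_path="./zephyr-7b-beta.Q4_K_M.gguf", n_ctx=4096)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Write a haiku about spring."}],
    max_tokens=128,
)
print(out["choices"][0]["message"]["content"])
```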
You would be surprised at what all you do not need for a very efficient system. It's all a lie.
Why isn't there a library inside LangChain that automatically takes care of OpenAI or Mistral function calling?
Did you read the LangChain docs?
There is function calling support for a lot of the models in LangChain.
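For example, a rough sketch with the langchain-mistralai integration; the same bind_tools pattern works for ChatOpenAI (the weather tool is a made-up stub):

```python
# pip install langchain-mistralai langchain-core
from langchain_core.tools import tool
from langchain_mistralai import ChatMistralAI

@tool
def get_weather(city: str) -> str:
    """Return the current weather for a city."""
    return f"Sunny in {city}"  # stub implementation

llm = ChatMistralAI(model="mistral-large-latest")  # reads MISTRAL_API_KEY from the env
llm_with_tools = llm.bind_tools([get_weather])

msg = llm_with_tools.invoke("What's the weather in Paris?")
print(msg.tool_calls)  # e.g. [{"name": "get_weather", "args": {"city": "Paris"}, ...}]
```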
But this is not free, right? We have to pay for it if we want to use it, just like OpenAI's GPT-4 model.
No, buddy. It's free on Le Chat. I did try it and it's amazing.
It's a bit cheaper and it's faster than GPT-4.
@@IntellectCorner I see. So it's just like the OpenAI GPT-3.5 Turbo model: it's free if we use it through their chat platform, but we have to pay if we want to use the API. Am I right?
@@davidw8668 Ohh 😮. I wonder where I can find the pricing; I didn't find this info on the official website.
Can you test firefunction v1 as well?
What is this? I haven't heard of it.
Still seemed pretty censored in my testing, any tips?
I found most refusals could be gotten around by telling it the request was for a movie or book, etc. Also, craft the system prompt to explain that. Hope that helps.
@@samwitteveenai I am using it via the API and putting in a system prompt that states it is uncensored. It still won't engage with sexual content or information like building naughty things. Best for me so far is Mixtral 8x7B DPO from TogetherAI.
Can we get this model from Hugging Face?
Just wait. It will go haywire just like Gemini. I'm buying puts on Microsoft. Ask Google what happens with their stolen code for that model. Just wait.
Sam should trademark the Sam "OKayyyyyy"
lol, I need to come up with a catchier phrase.
It's a shame this place is full of thieves and shady people.
I actually get better results with GPT-3.5 Turbo than GPT-4 when it comes to coding. I don't know why you guys find GPT-4 better.
For me, 4 is much better for coding.