Great showcase Mervin 👍
I played around with Phi-4 for a couple of minutes, and I say this with a heavy heart: they must have "gamed" some of the benchmarks if they claim to be better than "Gemini 1.5 Pro" at certain tasks. Unless there's a different version out there that we are unaware of, this one seems extremely average.
I mean, it's acceptable for open-source small LMs to have limitations, but I don't think they are helping anyone by making these exaggerated claims.
There are many new research papers that improve on how we used to train models lol
hey @mervin
Can you make a video on how to do function calling/tool calling using an open-source model from Hugging Face?
Ask this model, Phi-4, about the longest beach in the world, and then ask again... All the answers are incorrect.
Exactly. Even ChatGPT etc. barely get things like that right.
@supermandem ChatGPT, Gemini and Claude will give the correct answer.
lol you guys don’t understand the use of these small models 😂
@geelws8880 Good luck trying...