Deepseek R1 + V3 combo with Cline is amazing.
I'd want to try Cerebras R1 70B in place of V3 next. If you haven't heard of Cerebras, they make wafer-scale AI chips and the output token rate is nearly instant. Until recently, they only had a crappy Llama 70B.
I will try this combo - at the moment I use Cursor and Windsurf :)
R1 has similar problems. CoT is useless if the request is misunderstood, which happens a lot. 4o performs better than o3-mini.
Mervin, idea for next video: model distillation!
Did you use o3-mini low, medium, or high? High is supposed to be better at answering the most difficult logical reasoning questions.
Gave this prompt to DeepSeek, which reasoned through the logic correctly. Surprised and disappointed that o3-mini took such a blind turn in its "thinking".
That sucks. There was so much hype over this.
I don't care honestly, let's just hope DeepSeek copies the upcoming o3 and releases it lol
Agree
It solved my automation problem. But there is no image upload, so it's not useful for me.
You should've also given the question to V3 and R1.
How do you use v3 and r1 in combo?
My experience with it in coding has been great so far
Thanks. The very first question gave me a very clear idea that this o3 model is a failed one.
IMO it's way better than o1 for coding. o1-preview, for some weird reason, is still the best.
o3-mini is a faster, cheaper, but dumber version of o1 - I don't see why I should use it now that I have R1.
Try Kimi 1.5 reasoning
That clickbait title thooo
Well, unfortunately o3-mini-high is a joke. o1-pro is so much better at reasoning. What they are claiming looks totally false.
o3-mini was useless for my agent, as it consistently failed to call any functions. Even when I hardcoded the tool choice, it just responded with the function arguments as a JSON string.
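The failure mode described above, where the model dumps the tool arguments into the message content as a raw JSON string instead of emitting a proper tool call, can at least be worked around by parsing that string yourself. A minimal sketch in Go, with `parseToolArgs` being a hypothetical helper name (not part of any SDK):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// parseToolArgs treats the raw message content as if it were the JSON
// arguments of the intended tool call - the fallback you need when the
// model skips the tool-call mechanism and replies with bare JSON.
func parseToolArgs(content string) (map[string]any, error) {
	var args map[string]any
	if err := json.Unmarshal([]byte(content), &args); err != nil {
		return nil, fmt.Errorf("content is not JSON arguments: %w", err)
	}
	return args, nil
}

func main() {
	// Example of what the model reportedly returned in the content field:
	content := `{"query": "weather in Paris", "units": "metric"}`
	args, err := parseToolArgs(content)
	if err != nil {
		panic(err)
	}
	fmt.Println(args["query"]) // weather in Paris
}
```

This is only a band-aid: it recovers the arguments but not the function name, so it only helps when the tool choice was already forced to a single function.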
"Imagine five dead people" - tonight, hearing your tests for the first time, I saw or heard this differently.
What if the prompt were "five deceased people"?
So often in English, the word "dead" is used less as a strict adjective, as in "already dead", like the mice in a cage with a cat inside.
100% this. "Five dead ppl" could imply they will die, since they are on the track and the train is coming toward them. It just assumes your English is so-so 😅
Your coding test has different problems for each video, which makes it hard to follow.
Your perspective on the trolley problem is not the only valid interpretation
You're wrong. Read the prompt again.
Bad model for my testing as well 🤔
10x better than deep shiit
Lool, cope more
This model was trained with STEM in mind; it sucks at everything else.
@@sephirothcloud3953 Perhaps another example of why it should be STEAM, with the Arts included. Da Vinci would likely agree. One step closer we are, however.
WOW this model is a beast.
This video sucks tho because it didn't compare to DeepSeek.
DeepSeek is dog water in comparison. It's literally terrible at Go. You can tell it's a distilled Sonnet model that mostly focused on prompt-kiddie languages and frameworks. It tried to make an optional parameter in Go by adding an int parameter without a pointer and then saying "if x == 0, set a default value". I can't even begin to explain how many things are wrong with that. o3 refactored an entire feature in my Go app in one shot, where even o1-pro missed things after five minutes of thinking. But o3 one-shots the whole thing in 10 seconds, and then even roasted me for the original code. o3 gives the same vibes as the jump we saw from Sonnet 3.5, but even more so.
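For anyone wondering what's wrong with the pattern described above: using a plain `int` with a zero-check conflates "caller passed 0" with "caller passed nothing", because Go has no optional parameters and 0 is a perfectly valid value. A minimal sketch contrasting it with the idiomatic pointer approach (function names here are made up for illustration):

```go
package main

import "fmt"

// The flawed pattern: 0 is conflated with "not provided", so a caller
// who genuinely wants limit=0 silently gets the default instead.
func fetchBad(limit int) int {
	if limit == 0 {
		limit = 50 // default
	}
	return limit
}

// Idiomatic alternative: a *int distinguishes "absent" (nil) from an
// explicit zero. Functional options are another common pattern.
func fetchGood(limit *int) int {
	if limit == nil {
		return 50 // default applied only when truly absent
	}
	return *limit
}

func main() {
	zero := 0
	fmt.Println(fetchBad(0))      // 50 - caller's explicit 0 is lost
	fmt.Println(fetchGood(nil))   // 50 - default, because nothing was passed
	fmt.Println(fetchGood(&zero)) // 0  - explicit zero preserved
}
```

The pointer version is the simplest fix; for functions with several optional knobs, Go code usually reaches for the functional-options pattern instead.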
I will stick with DeepSeek until they release o3.