At 8:34 many have pointed out that I don't have the DeepThink (R1) button pressed, which, if not pressed, means the request goes to DeepSeek V3 instead. Thanks a lot for the keen observation. Regarding the issue: I actually caught it while testing and re-ran the prompt with DeepSeek R1, and the code that was generated came from DeepSeek R1. In editing I had to keep the zoomed-in part, so the prompt shown is the one without the button pressed, but rest assured it's DeepSeek R1. :)
ai is also not safe from indian lecturing
🤣
The crazy part about this is that the AIs currently aren't allowed to see/visualize the result of their code, so they're essentially doing this blind.
Exactly
Hey, you are not using DeepSeek R1. Just enable the DeepThink (R1) button below to see its full potential. Since you are not using DeepThink, the comparison is questionable, dude.
I did use it, actually. In one of the instances I sent it without DeepThink, and that's the one that was highlighted, but I did change it back to DeepSeek R1; that just didn't show up in the edit. Thanks for having a closer look at my video.
o3-mini has impressed me the most since Claude 3 Opus came out. I can't imagine how good these will be in just 6 months; they will probably be true beasts.
Yeah o3 pro 😍
@@YJxAI But it's probably going to be too slow. Unless it's moderately fast I'm not gonna use it.
@ I am okay with the speed, but the weekly limits, yuck!
reminder that deepseek r1 was only a side project
In 6 months one of these companies will achieve AGI.
That was a really nice comparison. Great video man. Definitely deserves more views. 💪🏻
thanks a lot
You should add Claude to these comparisons too. It's still a beast at coding.
Do you use Cursor or any other agentic IDE, and do you use Claude in it?
@@YJxAI Yeah, Cursor also supports Claude.
Okay, one question: I used Cursor for 1 month, and after that it says my unlimited access is over. So does that mean I've hit some limits? If so, what are they?
@@monsterasap1827 Free trial, I guess. I did hesitate at first, but now I am on the Pro subscription and it seems unlimited to me. I code in it daily. It's worth it.
Nice comparison! Love these types of videos!
Hey everyone, I don't wanna watch it all, so please tell me which one is best.
DeepSeek R1, OpenAI o1, and o3-mini are all top contenders in the coding model space, each offering unique strengths and capabilities. Ultimately, the best coding model for a particular project or use case will depend on specific needs and requirements, making a thorough comparison and evaluation essential.
How is DeepSeek R1 even responding? Whenever I try, it gives the error: 'The server is busy. Please try again later.' Do you have a solution for this?
glad you asked. It works best around 1 am IST.
it's called the art of video editing 😂
It works best around the time Americans go to sleep.
If you want to use DeepSeek R1, just use it on Hugging Face or Perplexity; they host DeepSeek R1 on their own sites, and they don't have the Chinese censorship either.
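If you'd rather reach the hosted model from code instead of the website, here is a minimal sketch using huggingface_hub's InferenceClient with the public deepseek-ai/DeepSeek-R1 model id. Whether a hosted endpoint actually serves this model for your account or inference provider is an assumption, so treat it as illustrative only.

from huggingface_hub import InferenceClient

# Hypothetical sketch: query the hosted DeepSeek-R1 checkpoint via Hugging Face.
# The model id is real, but availability of a serverless endpoint is assumed.
client = InferenceClient(model="deepseek-ai/DeepSeek-R1")  # auth picked up from HF_TOKEN

response = client.chat_completion(
    messages=[{"role": "user", "content": "Write a Python function that checks if a string is a palindrome."}],
    max_tokens=512,
)
print(response.choices[0].message.content)  # prints the model's reply text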
He isn't doing it live; he just records when it happens to work. Amazing, eh?
Been waiting for this bro, thanks
🥹
The battle for coding supremacy is on, but which model reigns supreme: DeepSeek R1, OpenAI o1, or o3-mini? Each has its own strengths and capabilities.
Now this is what we call a coding test for SOTA LLMs, not some "how many Rs" question or creating a mock website.
🥹
Please compare o3-mini with Claude 3.5 Sonnet.
Do you use it in Cursor or any other IDE?
Yes I do. Claude Sonnet 3.5 is better than R1 and o1. Don't know about o3.
@ Better than o3-mini, better than any other model in Cursor by leaps and bounds.
you forgot to choose deepseek r1 when prompting a dragon
I corrected it, don't worry, but that part was trimmed in the video.
@@YJxAI OK, keep up the work, your videos are among the best for testing.
The battle for coding supremacy is heating up with Deepseek R1, OpenAI O1, and O3 Mini, each boasting impressive capabilities, but only one can be crowned the best. Ultimately, the choice between these models depends on specific needs and preferences, as each excels in unique areas of coding and problem-solving.
yeah you are right
I think the new LLMs being launched come pre-optimized for simple questions like the snake game, a calculator, 9.11 vs 9.9, etc. 😅 But if you ask a different question, they do not give a proper answer 😊
That's why I'm trying new things. I hope people like it.
If your comparison doesn't have Claude Sonnet, then your whole comparison took a wrong turn.
Just look at an API usage comparison by token volume; that's why there's no reason not to include Claude here, since it's still at the top despite having the oldest update.
o3-mini doesn't cost as much as it seems. Claude is $18 combined (input + output, per million tokens) vs $5.50 for o3-mini, which still seems high, but looking at my API usage I can confirm it's actually equal or cheaper.
But even ignoring that, Claude has some points in its favour which I'm thinking of discussing when I get time, but I have to test more to verify it.
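A back-of-the-envelope sketch of that cost claim. The per-token rates below are the commonly quoted list prices that add up to the "$18 vs $5.50 combined" figures, and the 3x longer output for o3-mini (to cover its hidden reasoning tokens) is purely an assumption for illustration:

# All numbers are assumptions for illustration, not measured usage.
def cost_usd(input_toks, output_toks, in_rate, out_rate):
    # Rates are dollars per 1M tokens.
    return (input_toks * in_rate + output_toks * out_rate) / 1_000_000

# Hypothetical coding request: 4k tokens of context, 1k tokens of visible answer.
claude = cost_usd(4_000, 1_000, 3.00, 15.00)     # Claude 3.5 Sonnet: $3 in / $15 out
o3mini = cost_usd(4_000, 3 * 1_000, 1.10, 4.40)  # o3-mini: $1.10 in / $4.40 out, 3x output for reasoning

print(f"Claude 3.5 Sonnet: ${claude:.4f}")  # ~$0.0270
print(f"o3-mini:           ${o3mini:.4f}")  # ~$0.0176

Even with the inflated output, o3-mini comes out cheaper in this toy case, which lines up with the "actually equal or cheaper" observation above.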
Use V3 for coding; R1 is for reasoning.
Interesting stuff, good to see how they perform. But I think the best test would use the most commonly used programming languages, since that's what people will actually be coding in.
Now hear me out, what if you could combine all of every single AI that is used for coding Into ONE? that'd be so OP bro oml
yeah
R1 is free whereas o3 and o1 are paid.
Yeah, a big, very big plus point, but be cautious with sensitive data on the DeepSeek website.
Not sure that's DeepSeek R1. The first dragon prompt isn't using DeepThink when you send the request.
Yeah, you are right. That does make a huge difference.
Yeah, actually I generated it and didn't see any thinking tokens, so I corrected it and clicked the DeepThink button, but at that point I wasn't speaking, so that didn't show up.
TL;DR: Don't worry guys, it's R1. But thanks for paying so much attention. ❤️
That ball test was an absolute failure for all models.
o3 did maybe a bit better. They could all have done better if I had used three.js, but with three.js a lot of assets are already available, so results look good without being too hard for the models. That's why I wanted to make them do things from scratch.
Sonnet 3.5 is the best coding model.
Do you have information about the new Claude?
3.5 (new)? Yes, I have made 3 videos on it.
I wasn't expecting an Indian accent when I opened the video. However, thanks for the comparison.
To make a comparison I think everyone can understand: the current o3-mini-high is the equivalent of a GeForce GTX 250, and to get real-world performance that can do heavy work, we need to take it through several generations, closer to the RTX 3000 series and above.
The GTX 250 could boot a modern AAA game, but only at sub-HD resolution with everything on low, and it's not pretty to look at when you're getting 20-30 fps.
Yeah, good comparison. The only difference is that they released cards every 2 years with roughly 50% improvement each time, whereas AI labs saturate benchmarks in months. I think the exponential curve is steeper here.
@@YJxAI Definitely a lot steeper. At least, I hope they can keep the curve that steep :)
Bro is talking about coding models but not about Claude Sonnet 3.5, I'm done.
I've talked about Claude in more than three videos. I am done.
AI is dumber than I thought.
Not really, I think it's just not trained on a Blender dataset, or they might have forgotten about it. Try a fine-tuned version of these models, or make one.
R1 did better when I tried it myself. Something is off about your r1
Is it just me, or does o1 now generate the response a lot quicker than before?
It fluctuates. On release it spent very little time thinking, then for a while it sometimes thought longer, and now it's quick again. Maybe OpenAI tweaks the thinking time based on the general sentiment of people, or on whether there's any competition, or it may just be a conspiracy theory.
Claude 3.5 Sonnet is the best coder in my opinion.
You used o3-mini-high, not o3-mini, (((
github copilot bro
These are either the best AI models in the whole world or, as some say, a big company's dumbest AI in the world.
OpenAI o3-mini 🎉🎉🎉
LLMs can't generate 3D objects like a dragon or a cup, lol. Use a different kind of model for that test: 3D diffusion models.
Bro, thanks for your efforts, but I don't like the new benchmark. The old one with a table and hard reasoning tasks was better.
The best coders are software engineers. Code from AI is a copy of code taken from software engineers. AI is hype and fraud.
Have you heard of Flash Thinking? It's the daddy of all three of these.
watch my video on it :)
@YJxAI You made a video on it, so why didn't you include it here? You're showing these models as the best.
Bro, please compare with Gemini 1206 and similar models. They are also good.
Yes, it's a model I love and I have covered it as well. I have also covered the Flash and Thinking models; please check my channel.
And 1206 is officially going Pro soon, so stay tuned for that... 😁