iob.fyi/codecrafters will let you sign up to try CodeCrafters challenges yourself, if you're interested in seeing whether you're smarter than an AI.
When all the dust settles and companies realize they need developers again, just remember their first reaction when they got their new toy was to eliminate developers.
Why isn't AI replacing management and CEOs? Why just developers and employees? Hint: Venture Capital is full of scams targeting people with big wallets and smooth brains.
Oh we're not going to forget this
@@KevinJDildonik Because interaction with people is harder than interaction with machines.
@@KevinJDildonik Because the responsibility on managers and CEOs is greater than on programmers. A manager is responsible for developers, and a CEO for everyone, often legally. It's not an ability issue, it's a risk-mitigation issue.
Pepperidge farm WILL remember.
let's gooo, another internet of bugs video!
Yes, finally someone actually tests it. Even though most of the benchmarks are probably in the training data of many LLMs, this was telling. Thanks for this video.
Great comparison! Thank you for your effort 🙏🏼👍🏼
Thanks for this evaluation. As a non-coder I tried copilot last year to see what it could do and quickly realized it was useless for me, so I never wasted my money.
I've been watching the space, but haven't seen any comparisons like this. Really helpful for people like me who don't have coding experience, but want to understand where the technology is at.
Thanks for saying so.
Your statement at the end about MBAs sitting in a corporate ivory tower, making layoff decisions based on hype, is the real concern about this whole AI debacle, and it is happening. This has the potential to set many companies back many, many years.
Was it Oracle who said they've been able to reduce their programming staff by like 75% or something? Wonder how that's going…
@@pchasco It won't last. Give it 6 months, especially if the economy starts to heat up in Q1 of next year.
Well, if it's all true, once the hype subsides they'll need us like never before to clean up their mess.
That might be a good time to strike gold with contracts that will allow us to save up enough to retire…
@@a_mediocre_meerkat It's like self-imposed Y2K prepping.
I swear every product I use, even just as a user and not a developer, is turning to shit since the AI push. Google products in particular, which makes me sad because I used to really love their products.
This video continues the pattern I see with AI coding: all the skeptical/critical examples are extremely detailed, while all the praise I see is extremely vague.
Yes, clear signs where the logical side resides.
Yours are the only videos I can watch on this topic.
I get literally nauseous with disgust when I see chatbotshill content.
Cheers mate, excellent stuff
16:28 Can Confirm, Carl did enjoy playing with himself during the making of this video
@2xbyx4 LOL. I almost went back and re-recorded that, but I figured YOLO. I'm not EXACTLY sure what words I was trying to say (about having had fun playing with CodeCrafters), but it sure didn't end up sounding like whatever I meant. 🤣
@@InternetOfBugs we love watching you playing with yourself, no need to thank us for that.
This was the most honest video about the state of LLMs for programming I've seen
Sorry, I do not agree. He doesn't understand the future of AI. It isn't just a probability machine.
Agree 💯, good video
I absolutely do not agree. This was a highly manipulative ad. So many people are under the influence of mind control! It is crazy!
@@isaacalves6846 Here is what AI said about your criticism: "I understand where the creator is coming from. The AI systems in the video did struggle with some relatively simple tasks. However, I've also seen AI systems do some amazing things. I think it's important to remember that AI is still in its early stages of development. There's a lot of room for improvement, and I'm confident that AI will eventually be able to do many things that are currently impossible."
Except he didn't use any of the top LLMs. Claude, GPT-4o, Gemini.
Great video! It shows that while AIs are useful, they can’t replace the creativity and problem-solving of human developers. A must-watch for junior devs
It's incredible this content is free on the internet. Take my money, Lord of the Bugs. (tried to post this comment in the long form video, but the feature is not enabled there...)
You’re doing the lord’s work. Keep it up!
Great work Carl, appreciate your more sane approach to test these LLM based code generator rather than make-believe results.
Best intro you've done so far, and that's a pretty high bar to beat lmao
Carl your "straight-up" objectivity, with comparing these AI Coding Generating Tools, is what keeps me watching (and learning) from your many years as a software engineer/developer. Yes, please do more AI demos! Really enjoy how you lucidly convey your thoughts. And you're right, watching you type would be boring as "F*@^. HA! 😜
Great stuff. Really hope for comparisons of more complicated tasks. I bet they're entirely different and the AIs reach a breakthrough.
😂
The challenge you gave to AI has been solved by lots of people in the exact same way & posted on github & most likely all the code is already there in the training data of the big models.
I tried a few codecrafters challenges & my cursor copilot was finishing the code for me following the exact requested spec before me & I just had to tweak the code.
So, it'd be interesting to see how it does in a brand new challenge which doesn't exist anywhere yet.
I love how well thought-out your videos are. Thank you for your videos.
Thanks for this enlightening video. It seems more like you were a patient instructor helping the AIs as if they were new dev students, rather than the AIs helping you (the person who paid for their help).
Thank you for this fantastic video. I especially like the methodology you chose to evaluate these tools. Would love to see more demos in this format.
Thank you!
This is a great comparison / general test of AI. First time being recommended your content and I'm excited to see more
These are the best videos on AI, and I always show them to my programmer colleagues whenever they make uninformed statements about how AI will replace our jobs. This just confirms all my suspicions about the limitations, and it only scratches the surface. I work on much larger codebases with more difficult challenges, and it can't even do these easy tasks you show here properly. I'm not worried about my job security at all.
I've done the CodeCrafters http challenge and the description is clear enough for an AI to write the code, like the requirements are so constrained for an AI to do its thing. Imagine having to start introducing edge cases or start thinking about maintaining the crap that you didn't write.
@@bnchi The reason it’s capable of even doing that is that the code for building an HTTP server is readily available on the Internet and has been for many years. It was ingested as part of the ML learning process. It’s not creating anything new. It’s not creating anything you couldn’t already find on stack overflow or GitHub or a coding book or anywhere similar.
Yes, all those things are why AI won't replace developers, but it's still way more efficient to work with the AI than to do it all on your own. Regarding introducing edge cases: this is more a function of how well you structure the code. AI allows you to write and refactor code faster, so overall in the same time you can produce code that is *more resilient* to edge cases. The same goes for maintenance. AI allows you to refactor code into its own functions much more quickly, for example, which reduces spaghetti and increases maintainability. Your criticisms really only apply if 1) you don't give detailed instructions to the AI about the code you want it to write, or 2) you don't iterate on the AI-generated code with more instructions and manual interventions.
@@seeibe I often use my editor's refactoring tools to move a highlighted region or block into a function, inline code, change a function name across the entire codebase, etc. The editor is way better at these tasks than an LLM because it works with the actual AST of the code, so it knows far more about the code than a guessing tool like an LLM does. Letting the AI make architecture decisions or suggest names in my code has always been a failure, and I have to describe things over multiple iterations to get something I would have written much faster myself.
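(To make the AST point above concrete, here is a minimal Python sketch of the structured view a refactoring tool works from; the sample source and names are made up purely for illustration.)

```python
# Minimal sketch, for illustration only: a refactoring tool operates on the
# parsed syntax tree, so it sees exact nodes (function definitions, call
# sites) rather than guessing from text the way an LLM does.
import ast

source = """
def fetch_user(uid):
    return lookup(uid)
"""

tree = ast.parse(source)
for node in ast.walk(tree):
    if isinstance(node, ast.FunctionDef):
        print("function definition:", node.name)   # exact, from the AST
    elif isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
        print("call to:", node.func.id)             # exact call target
```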
@@seeibe What AI does is speed up things. Whether you're an expert who routinely writes well structured and efficient code or a beginner who regularly produces crap, what you produce won't change, you'll just produce it faster.
As a member of the general public I do need to keep a search engine open in another window to explain what you are doing. But it is clear you are actually putting the AI through rigorous tests applicable to your field of expertise. So much of the media is taking these companies' claims at face value. Thank you for the time and effort you are putting into this, and for spurring me to find out more about coding.
I'm very interested to see how Cursor performs with medium tasks. Judging by the way the "very easy" tasks look, using AI tools to write the code and then debugging them takes about the same time as writing the code yourself.
Imagine a person who has no clue how to code sitting down and trying this.
Not sure but I'd expect it to be a great tool for learning? Even now when I want to learn a new tool, I will let the AI generate the code for me, and then have it explain the parts I don't understand. Way faster and more hands-on than first painfully diving into the documentation and exhausting yourself before you can write your first line of code.
@@seeibe Maybe, but the quality of your learning process degrades. You believe it doesn't, because look how fast you're learning, but it actually does. AI is meant to help you automate boring, mundane tasks that you already know and can fact-check easily. It isn't meant to be used as a tool for learning.
@@seeibe Sure, I get what you're saying, but learning is about reading and then using that knowledge to make something. When you use AI you don't do much; it basically gives you chewed food.
Because of the Dunning-Kruger effect they will have no idea of the many ways in which their end result is broken.
@giorgikochuashvili3891 A lot of what you talk about there is cross-referencing knowledge in our brains. With the current state of human knowledge you'd have to hyperspecialize for that to be effective. I'd rather be a generalist with broad knowledge, and let the machine take care of dredging up the details from its vast pool of knowledge.
Yes, keep separating the hype from the value. This is an honest service.
If your company is trying to replace developers with AI, what that tells you is they have no clue how value is created in a technical organization. If they think they can replace developers because AI is able to "write code", they are not just factually mistaken about whether AI can write good code; they are more deeply mistaken in thinking that merely writing code is the primary value their technical team members add.
Days without attacks on Python: 0
Thank you for your video; it was interesting, and unexpected in the end :) Waiting for the next experiments with TypeScript and/or Rust (or any other compiled language).
The intro convinced me I want to see comparison videos a lot more
Great review, thanks. Especially useful right now with all the hype around Cursor, at least on twitter (apparently deserved at least relatively speaking).
Great stuff, I've been looking for exactly this type of breakdown
Your content is very good. I am proud to be your subscriber. Wishing you the best of luck ❤❤❤❤
Oh man, as someone who's likely going to have some demo or another that would fall under your purview of interest, I'm excited someone is doing this the right way!
Man, that's the analysis we should be getting from the media. Big thanks for your hard work.
Thanks for this comparison. I'd love to see videos with harder problems.
Keep up the good work. Looking forward to the next video!
12:16 syntax errors aren’t affected by try… except block. Python checks syntax at the start during compilation to bytecode, and try…except block catches runtime errors
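(For anyone curious, here is a minimal sketch of that distinction: a try/except can only catch a SyntaxError when the bad source is compiled at runtime, e.g. via compile() or exec(); a syntax error written directly in the file aborts before any line runs. The example source string is invented for illustration.)

```python
# A try/except cannot catch a syntax error in its own file: Python compiles
# the whole file to bytecode before executing anything, so the SyntaxError
# is raised at compile time, before the try block ever starts.
#
# It *can* catch one raised while compiling source at runtime:
bad_source = "def broken(:\n    pass\n"  # deliberately invalid syntax

try:
    compile(bad_source, "<generated>", "exec")  # compilation happens here, at runtime
except SyntaxError as exc:
    print("caught at runtime compilation:", exc.msg)
```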
Subscribed, pretty much what I expected. Can you also add Replit's agent, given all the hype it's been getting?
I also think this should be redone every 1-2 months as a series given the current hype cycles.
Haven't tried that one - I'll go take a look. Thanks.
Not sure about repeating it - we'll see. It's a lot of work. As long as people keep watching them, I'll keep making them, but I'm concerned people will get bored and it would be wasted time and effort.
Nice video. Keep up the good work!
What about AIDER with Deepseek Coder v2 running in a local machine?
Sharing the actual prompts given to the AI to resolve the problems would make a difference. If the prompts were merely copied and pasted from the CodeCrafters page, it is no wonder the AI failed; there is a specific way to "talk" to the AI, and you have to keep in mind to break the problem into tiny steps.
Secondly, all the coding assistants mentioned do some RAG over your existing interactions with your code, which means crappy results on first use (at least outside chat mode), so I'd expect more accurate results from pasting those problems directly into the web interface for GPT-4o or Claude 3.5.
I'm not interested in the "how can a skilled programmer best use an AI?" question. I'm interested in the "how well can an AI emulate (or replace) a skilled programmer?" question.
This statement reflects a weird fact: if you play by the AI's rules/prompts, the outcome will generally be better (similar to autonomous driving, where it takes some time to adjust to the driving algorithm). It means that for AI to perform well, humans now have to adapt to the AI instead. That perspective is either depressing (if AI becomes dominant) or unrealistic (if AI continues to require it). And then we are also asked to trust a system that NO ONE, not even its creators, understands well??
@@InternetOfBugs lol… watching your videos on this topic reminds me of videos from other creators using LLMs to write programs that would take an entry-level dev 15 or 20 minutes, and managing to pull it off in just a few hours of prompting.
I think this comes down to the information involved in describing what you want. For example, what if I told the prompt, "Make me an app that will be successful"?
It doesn't have enough information. There are so many paths it can take because of the ambiguity. Most of the AI output is just reuse of templates that are already known. To innovate, you would need to manually create new ideas at a fundamental level. So in the future, AIs are reduced to the idea-creating capabilities of the user.
Never thought about Python's syntactically relevant indentation tripping up LLMs, but it seems obvious now. Cheers!
Yes, more videos like this. Perhaps showing what to do yourself and what the AI should automate for you.
This was a great way to compare and test different integrations. I would love to see a more "realistic" test that implements some generally understood best practices for prompt engineering (since that is often how devs who try these tools actually use them).
Great use of screen of death!
What about with the new OpenAI o1-preview??
I was actually pretty mad when I heard the challenge since it sounded cherry-picked to be way too easy. I guess even in those cases AI sucks. I don't get why so many people I work with say it's so amazing.
Most people simply repeat what they are told.
AI can automate some tedious and boring tasks, and that's about it. Which is I guess nice, and that's why people like it, but it's only the easiest part.
Excellent question. Maybe they expect a drastic improvement with the next generation? Maybe they are impressed that AI (deep learning) can do anything at all, given that it is such a new technology?
@@CaridorcTergilti Most certainly part of it. Whenever someone actually listens to the criticisms of the current state of LLMs, their next response is "well, this is the worst it's ever going to be, it just gets more powerful from here," which just is not guaranteed.
Because most people aren't able to build an HTTP server.
Highly informative video, thanks for the objectivity and clarity
Awesome video, looking forward to more
Looking forward to seeing more. I find copilot/codeium handy in the moment to moment work I do, mostly when it fills in a line or block just the way I would have done, or saves me a Google when I can't quite remember the syntax for something. Definitely curious about cursor...
Great video. Please make more.
Love it when your intro interrupts stuff
Nice summary, definitely made me want to watch the full session. Around 15:48 you mention that you wouldn't use it for work, outside of testing: did you mean testing the AIs or using them to perform/write tests?
I bet he means testing advancements in AI
Testing the AIs - as in making more videos like this.
@@InternetOfBugs phew!
Great video, really helpful and informative
Interesting vid 👍 please do more comparisons 👍
Thanks Carl loved the video.
Very interesting, thank you :) I'm very interested of seeing this in C#
I am so curious how far (your) left your bookshelves go :-)
Content quality that everyone needs. Hype is created by people who make money out of pumping FOMO.
Boy you really do be going the extra mile lol. Thank you
I'm also very curious about how they perform with new vs. old languages. Anecdotally I've seen them handle C much better than Rust, etc. It makes sense, since they'd have more to train on, but I'm curious how much better any one model would do on the same problem using different languages.
great video, Carl! I shared it on my LinkedIn
There are loads of simple editor tools to run before commits that can automatically fix simple whitespace problems, e.g. Black. Basic syntax errors too. But point taken about Python.
The one who makes honest, unbiased, in-depth tests of billion$ AI projects may gain a lot of attention.
Monetization and value of those projects may be at stake 😉
I hope Sam Altman won’t stalk the author of the video and assault him
Hello Internet of Bugs, I have a question: what language do you recommend studying, taking into account that today there is a lot of demand for JS, Java, and Python programmers and little supply, or else offers full of applicants? I am a second-year software engineering student and I am a little worried (not so much) about the number of applications there are for each position.
Love seeing Welch's TCL and Tk book over his shoulder.
Great stuff - I wonder what this tells us about the merits of the *other* Copilots.
There's an interesting hidden message here… all those LLMs learned from the feed that was available to their creators, so they carry that "accent" in whatever they spew out. I bet the PyCharm one was the most convoluted and advanced of the four (albeit fumbling) because JetBrains fed it higher-quality learning material harvested from the code actual humans entered into their IDE. It doesn't change the fact that they all fumble and stumble over their own feet like ghouls in Fallout, though 🙄😉
I think where current coding AIs are arguably the most useful is their auto-complete feature. At least for an experienced dev, I think it can save you time. I'd really appreciate you running the same 4 AIs through their paces in this area, because the ranking could be totally different there (as it is often a different model than in the chat feature).
Good luck with separating the Hype from the Value!
Great video, very informative, thanks! Did not think they would fail this hard, lol. Maybe even AI has problems with weakly typed languages?
Hey, nice exercise! I did something similar recently to compare the "new kids on the block" and came to a similar conclusion. A couple of questions:
1. Did you use the default model for Codeium? You can also use gpt4 and Claude 3.5 sonnet with it (in a limited way).
2. When using copilot, do you mostly use the inline chat functionality? While it's true that its quality seems to be a bit lacking lately, correcting issues has typically worked well for me, but I tend to use the chat window and keep the right files added as context while using it. This might make it perform better, but perhaps you already do that.
Anyway nice video, as usual!
Isn’t there a free VScode extension that does the same as cursor?
I don't know, is there?
Yes there is, Cody. I use it. Last year I was using BITO or something like that.
Pretty decent honestly.
I think that inline chat with copilot has no concept of a history of file versions, does it? I see you talking to it mentioning a "previous version" of the code, which I'm pretty sure will be nonsense to the model as it won't have access to a "previous version". Wonder how much those things affect the results.
This man is going to get pushed down a lift shaft, by Satya Nadella😂 But, seriously, great content. Keep it up.
Nyah - I expect I'll get run over by a rogue Tesla CyberTruck in FSD mode.
Great job mister
Give an agentic LLM product a try. Suggest you try Claude Dev. Perhaps it is a preview of Devin
First comment I read that actually had some meat to it. This video was ultimately created as an ad
I came here for this. CLAUDE DEV is great!
Two critiques: 1) you should absolutely be using the API console with temperature at zero, in all cases. 2) using the exact same prompts for each isn't really a great test, because different LLMs have significantly different techniques a skilled user would use to consistently get optimal results. Still, a fantastic video.
I personally did similar tests on python, JS, and Elixir and Sonnet 3.5 blew everything out of the water, and I also massively appreciated the much larger context window.
The scenario I was trying to emulate was the "unskilled (or at least unknowledgeable about programming) user." I'm more interested in the "can an AI replace a programmer?" question than the "how can a skilled programmer best be more productive using AI?" question. So I used the default AI configs and copy/paste prompts.
Whether or not that's the most important (or useful) question is a whole other topic for discussion.
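(For readers who do want to pin the temperature as the earlier comment suggests, a minimal sketch with the OpenAI Python SDK might look like the following; the model name and prompt are illustrative placeholders, and other providers expose an equivalent parameter.)

```python
# Minimal sketch of the "temperature at zero" suggestion, assuming the
# OpenAI Python SDK; model name and prompt are placeholders, not the
# setup used in the video.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o",   # placeholder model name
    temperature=0,    # minimize sampling randomness between runs
    messages=[
        {"role": "user", "content": "Write a Python function that reverses a string."},
    ],
)
print(response.choices[0].message.content)
```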
I'm a relative beginner with about 2 years coding experience, and already even at my level of experience I frequently have to solve problems entirely on my own, and they end up looking nothing like the initial solution the AI (claude in my case usually) uses. Can't imagine what it's like for actual experienced veterans.
I ended my Copilot subscription a few months ago. I've just been using my gpt-4o subscription and that's been ok. I'm not really into looking for a code-first AI again (yet). It's just not good enough and doesn't really help me beyond what I can do with gpt-4o at the moment.
Now in fairness I use Cody by sourcegraph which didn't make the list.. and it has increased my output, I'd say x3 at least.
However I've also been coding since the 80's, and when I started there wasn't even an internet. So I don't notice these errors you are pointing out and my guess is as a solid coder yourself you knew where it was screwing up but the rules of the game prevented you from letting the model know other than console outputs and errors etc.
So for boilerplate (and let's face it, after 40 years at a console everything looks like boilerplate), for boilerplate code it's a fantastic time saver. It also allows me to think ahead about my coding strategy as the codebase in an app advances.
The subtitles read "the company that made Devin was a new-ish startup and I certainly *can* recommend them". I'm sure you meant "certainly *can't* recommend them". Assuming the subs were transcribed with something like Whisper and not checked, this is, for this channel, pretty ironic.
more comparisons!
This guy is a bucket of ice water on CEOs who have drank the Koolaid!
koolAId
You might wanna review ChatGPT o1-mini and o1-preview. This might be the version I'd be willing to pay for.
Hoping to start working on that (o1) video this week.
Very interested in specifically the “AI no-code app builders” hype
I'd been using Copilot since forever; I had no idea it was so bad. Is it still on the custom fork of GPT-3 for code, or have they updated it to one of the newer models?
DO MORE OF THESE
Also I think Rust or Clojure would be fun. Rust has fucking confusing error messages if you're doing anything with threads. Wondering for Clojure if they can get the parentheses balanced right.
Awesome video, Internet of Bugs!
I gave ChatGPT a problem I was not completely understanding, discrete mathematics from Epp's book. It had some conditions, and the task was to build subsets from them. It couldn't do it correctly even when I gave it more and more extra data and explained things. If it can't do such basic things, it can hardly understand and manage a complex system.
I wonder how well Google's stacks up. For compliance reasons, it's the only one my work allows, but I only use it to stub out unit tests.
I looked at that, but Gemini seemed about twice as expensive as any of the others (once the trial period expires - at least according to cloud.google.com/products/gemini/pricing ) and that's comparing Gemini's full year's commitment price to Cursor/CoPilot's month-to-month price, so I decided not to deal with it. I might add it if i keep doing more of these, though.
@@InternetOfBugs yeah, I think most of my team is just using the much cheaper generic chat interface when they need to use AI tools. I don't know if we'll use it after the trial period.
Please add Claude Dev to your list of options.
More people need to see this video...
It's so funny: I canceled my Copilot subscription after one year and moved to Cursor, which uses Sonnet, and then this video came along.
Could you compare Cursor with Replit? Thank you for your great videos 🎉
I’m enjoying this! Great video! I hope he uses cursor and Cody!