Anthropic’s New AI Can Control Your Computer!

  • Published on Dec 23, 2024

Comments • 327

  • @DailyTuna
    @DailyTuna 2 หลายเดือนก่อน +102

    Anthropic does seem to be the humble AI company. It’s refreshing not having the CEO doing speeches on grandiose visions. They just do their own thing.

    • @julien5053
      @julien5053 2 months ago +4

      Dario Amodei does interviews and sometimes shares a grandiose vision. But yeah, Sam Altman communicates a lot and is a bit too optimistic.

    • @smackyay
      @smackyay 2 หลายเดือนก่อน

      Sam Altman is an idiot

    • @jimj2683
      @jimj2683 2 หลายเดือนก่อน

      Anthropic is just as bad. It is a get rich quick scheme.

    • @LuckyDuck24
      @LuckyDuck24 2 หลายเดือนก่อน +3

      lol. You must not be that deep into AI yet. If you’re as deep as these guys are then you see a lot of big things it can do for the better. Like Altman says. The bad too.

    • @thomassynths
      @thomassynths 2 months ago +2

      It's a marketing tactic. The illusion is strong, especially when you can sling meaningless words around like "safety".

  • @Kylbigel
    @Kylbigel 2 หลายเดือนก่อน +64

    I love how these tech companies are diversifying their products and not all just doing text and images.

    • @thatonecommunist
      @thatonecommunist 2 หลายเดือนก่อน +4

      pretty much everything is just text images and audio.
      What else are you looking for, smell?

    • @Derick99
      @Derick99 2 หลายเดือนก่อน +12

      Action

    • @DreaMagnifier
      @DreaMagnifier 2 months ago +1

      @@thatonecommunist yup bro, humans don't just have ears and eyes,
      they have a nose, a tongue, and skin. You'll be happy when AI has all these too

    • @franklin519
      @franklin519 2 หลายเดือนก่อน +4

      Embodied AI is next and then we are really off to the races.

    • @BoominGame
      @BoominGame 2 หลายเดือนก่อน

      @@thatonecommunist With an industry with so many developers in the loop, it's something you don't want, trust me.

  • @LatentSpaceD
    @LatentSpaceD 2 หลายเดือนก่อน +4

    Thank you , Matthew - always appreciate your enthusiasm and hard work :) !!

  • @pamudajayathilaka
    @pamudajayathilaka 2 หลายเดือนก่อน +64

    We definitely need a testing video fr❤🫡

  • @desmond-hawkins
    @desmond-hawkins 2 months ago +12

    The Computer Use feature is absolutely going to replace lots of menial jobs that have been too niche to automate, where it was too expensive to hire someone to replace the current humans doing data entry and copying forms into other forms. Reddit has a few threads where someone in such a job learned to code and either was fired and replaced by their code or felt bad being the top employee actually working just 2 hours a week on script maintenance. Now we're getting closer to many more of these positions being automated. I just hope that at least some of these employees will make their repetitive jobs much easier by learning how to use this kind of automation quietly without necessarily losing their source of income (esp. disabled employees for example).

    • @Matx5901
      @Matx5901 2 months ago +1

      That's absolutely plausible.

  • @MarcusExplainsStuff
    @MarcusExplainsStuff 2 หลายเดือนก่อน +7

    What is wild is thinking the metrics need to exclude o1. It is different but same. It should be considered in these metrics. If it takes longer but is right way more often that is a clear trade off for performance which can be highlighted and would be more accurate for average users to understand.

  • @bjensen60
    @bjensen60 2 หลายเดือนก่อน +2

    I think you're completely right about how computer interfaces will quickly fade away as AI becomes capable of simply performing any operations you ask without a predesigned GUI

  • @remaincalm2
    @remaincalm2 2 หลายเดือนก่อน +6

    You'd think they'd call it 3.6 to clearly imply an upgrade. Heck, even 3.5.1 would do the trick!

  • @CubeZanimation
    @CubeZanimation 2 months ago +8

    I just built an RTS prototype in Unity 6 with Claude 3.5 and Cursor AI.
    In 8 hours I've gone from nothing to having a map, RTS camera, navmesh, units, animations, state machine, health bars, movement etc. ... all working.
    In Unity I only have an empty game object with a "GameMakerScript" attached. The script creates all GameObjects when the game starts and controls their parameters and links.
    It feels like magic.
    I only used 2 assets (Skeleton Warrior and Human Warrior) including 3 animations (walk, idle, attack)

    • @Matx5901
      @Matx5901 2 months ago +1

      That looks great.

    • @14supersonic
      @14supersonic 2 หลายเดือนก่อน

      Took you 8 hours for all that. Sounds impressive, but I'd imagine the cost was very high. Maybe around $100 USD?

  • @sapolsaikrasun2270
    @sapolsaikrasun2270 2 months ago +31

    It became so advanced that it learned how to procrastinate. Next it will look up cat videos on YouTube

  • @drummermike5150
    @drummermike5150 2 months ago +8

    Good stuff! Just fired up a Docker container using 1 command and it worked right out of the gate. Asked it to build a hello world app in JavaScript and it actually did it without any interaction from me other than the prompt. Amazing! Cost me 50 cents but well worth it.

    • @cuentadeyoutube5903
      @cuentadeyoutube5903 2 months ago +3

      I called it at the beginning of the year. AI is going to destroy the AI industry

    • @tollington9414
      @tollington9414 2 หลายเดือนก่อน +7

      As a professional hello world developer, this really scares me 😟 😢😮

    • @drummermike5150
      @drummermike5150 2 หลายเดือนก่อน +1

      @@tollington9414 I see what you did there. Good one. I always start out with something easy to make sure the system works with something that is as basic as it gets.

    • @jozonas
      @jozonas 2 หลายเดือนก่อน

      Your rhetoric makes it quite clear you don’t understand what you’re talking about. Keep up the good attempts.

    • @Matx5901
      @Matx5901 2 months ago

      Very interesting, thanks.

  • @hemiedwards217
    @hemiedwards217 2 หลายเดือนก่อน +4

    Looks like Claude has ADHD with that scenic sidetrack to Yellowstone National Park, lol.

  • @popopp2297
    @popopp2297 2 หลายเดือนก่อน

    Thank you for all that you do! I love your deep dives into AI! You've helped me more than you know.

  • @_WiseMass
    @_WiseMass 2 หลายเดือนก่อน +7

    Now Claude knows which volcano to work on in order to wipe out humanoids

    • @LuckyDuck24
      @LuckyDuck24 2 หลายเดือนก่อน

      Be scared. Vote Trump. Everyone is out to get us. (Teeth chatters and sweating instead of learning stuff and figuring out how to drive a car instead of riding a horse saying no way) lol. Messing with you dude. Really though buddy. Not scary. Did you know that when scary steam engines and trains first arrived they said women shouldn’t ride in them because the crazy g forces could explode their uterus. Look it up. You’re smart I bet. But stupid is as stupid does and again - that’s not you man. Go to.

  • @marc_frank
    @marc_frank 2 months ago +4

    I had that same idea a few months ago. I told ChatGPT to move the cursor to a pixel position by providing a JSON description of the action, then sent it a screenshot of what just happened, and did that recursively to achieve a task. It was able to order pizza. Claude refused to output such JSON, so that was a bummer.
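
A minimal sketch of the screenshot-and-JSON loop described in the comment above. Everything here is an assumption for illustration: the action names (`move`, `click`, `done`), the JSON fields, and the three callables are hypothetical stand-ins, not the commenter's code or any vendor's API. It uses a bounded loop rather than recursion.

```python
import json
from typing import Callable, Dict, List


def run_action_loop(
    ask_model: Callable[[bytes], str],
    take_screenshot: Callable[[], bytes],
    perform: Callable[[Dict], None],
    max_steps: int = 10,
) -> List[Dict]:
    """Repeat screenshot -> model reply -> action until the model reports 'done'."""
    history: List[Dict] = []
    for _ in range(max_steps):
        # The model sees the current screen and answers with one JSON action,
        # e.g. {"action": "move", "x": 120, "y": 300} (hypothetical schema).
        reply = ask_model(take_screenshot())
        action = json.loads(reply)
        history.append(action)
        if action.get("action") == "done":
            break
        perform(action)
    return history


def demo():
    """Run the loop against a scripted fake model instead of a real one."""
    scripted = iter([
        '{"action": "move", "x": 120, "y": 300}',
        '{"action": "click"}',
        '{"action": "done"}',
    ])
    performed: List[Dict] = []
    history = run_action_loop(
        ask_model=lambda _png: next(scripted),
        take_screenshot=lambda: b"",  # placeholder for real screenshot bytes
        perform=performed.append,
    )
    return history, performed
```

The `max_steps` cap is a design choice for the sketch: it keeps a confused model from clicking forever.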

  • @collinpurcell986
    @collinpurcell986 2 months ago +1

    The Anthropic team has definitely been using this computer use feature to build out long agentic training data for coding. RLHF now means giving a prompt for a whole application and giving feedback after potentially dozens of intermediate steps to develop it. 3.5 Opus will be crazy if the distilled version of its most recent snapshot (3.5 Sonnet) is already the best in coding by a mile. We’re on the exponential, boys!

  • @tiredlocke
    @tiredlocke 2 หลายเดือนก่อน +5

    I'm assuming that this requires a locally running agent to perform the local desktop actions? Giving a cloud-based AI model full desktop control to millions of remote agents seems like the beginning of a story I've heard somewhere before...

    • @BoominGame
      @BoominGame 2 months ago

      I haven't looked at the code, but in theory you could change the call to any AI, even a local one like llama3.2. The problem is that it might not be optimized for the set of tools Claude is using, but it's worth a shot indeed, to keep everything tidy.

    • @kristianlavigne8270
      @kristianlavigne8270 2 หลายเดือนก่อน

      It runs in an isolated Docker container.

  • @rhadiem
    @rhadiem 2 หลายเดือนก่อน

    I too am glad to see the Agentic tests in benchmarks. Thanks for the video.

  • @orthodox_gentleman
    @orthodox_gentleman 2 หลายเดือนก่อน +1

    As you mentioned, Open-Interpreter can control your computer in OS mode and you can use any model. It is free and open source! The problem is when providers enforce rate limits.

  • @dougcampbell7898
    @dougcampbell7898 2 หลายเดือนก่อน +9

    A computer using a computer. Who would've thought that.

  • @mpvincent7
    @mpvincent7 2 หลายเดือนก่อน +1

    Very cool! Please test Matt! can't wait!

  • @mitchellmigala4107
    @mitchellmigala4107 2 months ago +1

    Matt, I like your channel. A clarification that would be beneficial for people is that Gemini 1.5 Pro's math score is using 4-shot and Claude is 0-shot. So yeah, not exactly apples to apples on that score.
    One other thing about Gemini. Everyone is sleeping on Google's updated Gemini 1.5 Flash. You need to show people how good it is (extremely fast, intelligent and dirt cheap). I'm not a huge fan of Google, but we need to be objective.
    Keep up the great work.

  • @Monotoba
    @Monotoba 2 หลายเดือนก่อน +1

    Can you imagine an office full of people talking to their computers to get work done...

  • @DailyTuna
    @DailyTuna 2 months ago +4

    I’m noticing with all this technology that it’s going to be important to have a business-specific PC, something that doesn’t have any personal information. It’s just business; that way you don’t worry about giving some control away to the AI. That, or partition the hard drive for different operating systems

    • @desmond-hawkins
      @desmond-hawkins 2 months ago +1

      That or for more companies to adopt the architecture that Apple described at WWDC for "private cloud compute", with verifiable OS images and lots of cryptographic proofs guaranteeing that data is not stolen. If you haven't read the long page they published about it, it's pretty fascinating and I certainly hope more companies go this route and focus on proving that their systems do not keep data that is being sent to them. It's much harder to build and maintain such a platform, and you don't keep anyone's data… no wonder no one had built such a platform. The main problem is that Apple is making money selling hardware, while many more companies will be tempted to sell data.

  • @StoneShards
    @StoneShards หลายเดือนก่อน

    Looks like a good start. Voice interface will be huge from the user viewpoint. When every application comes with a built-in AI agent, the computer use AI won't have to figure out pixel stuff, it'll just tell the AI agent of the app what it wants, making the work go faster and more reliably.

  • @Kingbynature
    @Kingbynature 2 หลายเดือนก่อน +1

    Finally, been waiting on this :) Thanks!

  • @heiabjornholt
    @heiabjornholt 2 หลายเดือนก่อน +11

    Surveillance industry approves your enthusiasm!

    • @bitphr3ak
      @bitphr3ak 2 months ago +1

      The issue isn't that you're dropping tracking data; the issue is who gets to aggregate that, and to what end.

  • @therealsergio
    @therealsergio 2 months ago +9

    Imagine: "Claude, read all the new twitter posts in my feed, summarize the ones that we have discussed my interest in, and feed that summary to Google NotebookLM and generate a podcast for me." My theory: in 2025, UI will be "the new API". And no, a UI is not as optimized as an API for access, but... in many cases where there is no API or it's behind a paywall, agents will be using UIs as much as they use APIs.

    • @cuentadeyoutube5903
      @cuentadeyoutube5903 2 หลายเดือนก่อน +1

      I first thought “what a silly use case, you can just use an API” but then I remembered x and many others are becoming walled gardens and don’t provide easy programmatic access to them. This pierces through those restrictions. Your idea is great

    • @BrettWrightsPage
      @BrettWrightsPage 2 หลายเดือนก่อน +2

      This is probably why Sam Altman is so interested in Human Verification

    • @mao73a
      @mao73a 2 months ago +1

      I am predicting a rise in the number of CAPTCHA quizzes, not only during sign-up but also on every submit button or even during page scroll.

  • @keithprice3369
    @keithprice3369 2 หลายเดือนก่อน +1

    If the coordinate system is flawed, maybe add an accuracy test before telling it to click or type. So you try to get it to mouse over a button but before it clicks you send another screenshot so it can compare the position of the mouse to the target and if it's not over the button, it adjusts. This loops until it's over the button. Not ideal, but it seems like a way you could make it accurate.
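
The hover-then-verify loop suggested above can be sketched as follows. This is a simulation under stated assumptions: `move_to` and `locate_cursor` are hypothetical callables (in practice the second would find the cursor in a fresh screenshot), and the constant offset in `demo` is a made-up stand-in for a flawed coordinate system.

```python
from typing import Callable, List, Tuple

Point = Tuple[int, int]
Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in screen pixels


def inside(point: Point, box: Box) -> bool:
    """True if the point lies within the box, edges included."""
    x, y = point
    left, top, right, bottom = box
    return left <= x <= right and top <= y <= bottom


def hover_until_on_target(
    move_to: Callable[[Point], None],
    locate_cursor: Callable[[], Point],
    target: Box,
    max_tries: int = 5,
) -> bool:
    """Move, check where the cursor really landed, and correct until it is on target."""
    center = ((target[0] + target[2]) // 2, (target[1] + target[3]) // 2)
    aim = center
    for _ in range(max_tries):
        move_to(aim)
        seen = locate_cursor()  # position observed after the move
        if inside(seen, target):
            return True  # only now would it be safe to click
        # Shift the next aim by the observed error.
        aim = (aim[0] + center[0] - seen[0], aim[1] + center[1] - seen[1])
    return False


def demo():
    """Simulate a mouse that always lands 40 px right of and 25 px above the aim."""
    moves: List[Point] = []
    state = {"pos": (0, 0)}

    def move_to(p: Point) -> None:
        moves.append(p)
        state["pos"] = (p[0] + 40, p[1] - 25)

    ok = hover_until_on_target(move_to, lambda: state["pos"], (100, 100, 140, 120))
    return ok, moves
```

As the comment says, this costs an extra screenshot per attempt, so it trades speed and tokens for accuracy.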

  • @philreese
    @philreese 2 หลายเดือนก่อน

    Please do a test video on this if you think it will provide value to the community. Thanks again Matt for your work to create useful and informative content.

  • @JeremyRabbit
    @JeremyRabbit 2 months ago

    Yes! Please do a demonstration video testing Anthropic’s computer use on macOS so we can see an uncensored/unbiased test case.

  • @Rnjeazy
    @Rnjeazy 2 หลายเดือนก่อน +1

    This is the future I'm excited for! What a time we live in!

  • @Reflekt0r
    @Reflekt0r 2 หลายเดือนก่อน +2

    The only reason why we are not scared of that is that we are framing risk aware people as crazy.

  • @ghelmstetter-AI
    @ghelmstetter-AI 2 หลายเดือนก่อน

    Can you please provide links to the videos where you covered OS's designed for AI's (as mentioned ~09:40)?

  • @billyWhips
    @billyWhips 2 หลายเดือนก่อน

    Claude using claude is slick! I would love to see the inner claude also trying computer use, creating a mad infinity loop 😂

  • @danushkastanley1746
    @danushkastanley1746 2 หลายเดือนก่อน

    The best use case I see here is QA testing, as this functions similarly to Selenium or Playwright. I believe that in the future, we will be able to input test cases as prompts, and it will handle the testing. Additionally, with the advantage of a knowledge base, the AI could generate the reports as well. When integrated with agentic workflows, application testing would become super easy. I’m not even a QA; I work in DevOps, but looking at this, we would be able to accomplish anything in our day-to-day work. (Not to mention the job losses, haha!)

  • @timtim8011
    @timtim8011 2 months ago

    Yes, test it in depth! Please do it on your production machine, giving it full access to all your info and passwords, and full access to the internet. j/k!!

  • @n0van0va
    @n0van0va 2 months ago

    I coded the same thing with 4o, using Playwright to automate user actions in a browser. It's not fast but it works. Credentials are not sent to OpenAI: 4o is prompted to use keywords that get replaced locally when text/password fields are filled.
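
A minimal sketch of the keyword-replacement idea in the comment above, assuming a `{{NAME}}` placeholder syntax and a local `secrets` mapping; both are hypothetical, since the comment does not say which keywords or format were used. The model only ever sees the placeholder, and the real value is substituted on the local machine just before the field is filled.

```python
import re
from typing import Dict

# Hypothetical placeholder syntax: {{USERNAME}}, {{PASSWORD}}, ...
PLACEHOLDER = re.compile(r"\{\{([A-Z_]+)\}\}")


def fill_placeholders(text: str, secrets: Dict[str, str]) -> str:
    """Replace each {{NAME}} with the locally stored secret of that name."""

    def substitute(match: "re.Match[str]") -> str:
        name = match.group(1)
        if name not in secrets:
            raise KeyError(f"no local secret for placeholder {name!r}")
        return secrets[name]

    return PLACEHOLDER.sub(substitute, text)


def resolve_action(action: Dict[str, str], secrets: Dict[str, str]) -> Dict[str, str]:
    """Return a copy of a model-proposed 'type' action with real values filled in."""
    resolved = dict(action)
    if "text" in resolved:
        resolved["text"] = fill_placeholders(resolved["text"], secrets)
    return resolved


# Example: the model proposes typing a placeholder; the secret never leaves this machine.
local_secrets = {"USERNAME": "alice", "PASSWORD": "correct horse"}
proposed = {"action": "type", "selector": "#password", "text": "{{PASSWORD}}"}
ready = resolve_action(proposed, local_secrets)
```

The resolved `selector` and `text` would then be handed to the browser automation layer; with Playwright that could be a fill call on the page, which is left out here to keep the sketch dependency-free.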

  • @kunwar_divyanshu
    @kunwar_divyanshu 2 months ago +1

    On MATH, GPT-4o is evaluated with 4-shot CoT while Claude is evaluated zero-shot

  • @jonasjohnson1958
    @jonasjohnson1958 2 months ago +2

    I am interested in seeing the number of tokens the computer use model uses for different tasks. Please add this to your video on the model. Thank you.

    • @thenoblerot
      @thenoblerot 2 หลายเดือนก่อน +2

      In the docs:
      System prompt: 466-499 tokens
      computer tool: 683 tokens
      text_editor tool: 700 tokens
      bash tool: 245 tokens
      And it expects screenshots to be XGA/WXGA resolution, so: (1024x768)/750 = about 1048 tokens per screenshot
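
The arithmetic in the reply above can be worked through in a few lines. The figures are the ones quoted in the comment; the function names and the idea of counting each screenshot once are assumptions for illustration (in a real multi-turn session earlier screenshots may be re-sent, which this rough estimate ignores).

```python
# Figures as quoted in the comment above.
SYSTEM_PROMPT_TOKENS = (466, 499)  # low and high end of the quoted range
TOOL_TOKENS = {"computer": 683, "text_editor": 700, "bash": 245}
PIXELS_PER_TOKEN = 750  # divisor used in the comment's screenshot formula


def screenshot_tokens(width: int = 1024, height: int = 768) -> float:
    """Approximate tokens for one screenshot: pixel count divided by 750."""
    return width * height / PIXELS_PER_TOKEN


def estimate_tokens(n_screenshots: int, width: int = 1024, height: int = 768):
    """Return a (low, high) estimate: fixed overhead plus each screenshot counted once."""
    tools = sum(TOOL_TOKENS.values())
    images = n_screenshots * screenshot_tokens(width, height)
    low = round(SYSTEM_PROMPT_TOKENS[0] + tools + images)
    high = round(SYSTEM_PROMPT_TOKENS[1] + tools + images)
    return low, high


per_shot = screenshot_tokens()      # 786432 / 750 = 1048.576
ten_shot_task = estimate_tokens(10)  # (12580, 12613)
```

So the fixed overhead is roughly 2,100 tokens, and screenshots dominate as soon as a task needs more than two or three of them.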

  • @siddharthv2701
    @siddharthv2701 2 months ago +1

    It’s a wonderful step forward but it’s far from ready yet. It’s a lot of data to process with so many screenshots, and it fails too often and cannot resume where it failed

  • @machine69420
    @machine69420 2 หลายเดือนก่อน +1

    Great content as usual, but I don't know what happened to the audio in this video. I thought I was having a stroke with the stereo transitions.

  • @areacode3816
    @areacode3816 2 months ago +3

    I'm unsure how I feel about AI controlling my computer. Not so much right now; my concern lies in what this could mean for hackers or hidden corporate control in the long-term future.

    • @anthonyperks2201
      @anthonyperks2201 2 หลายเดือนก่อน

      The only real way to do it safely is to follow those safety protocols. Have your agent in a virtual machine, and only have on that machine the tools that you've validated they can use, and the sites they are allowed to utilize. It's a secretary for people that can't afford a secretary.
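
One way to picture the "validated tools and allowed sites" rule from the reply above is an allowlist check that runs before any agent action is executed. This is only a sketch: the action format, tool names, and hosts are hypothetical, and a check like this complements the virtual machine rather than replacing it.

```python
from typing import Dict, Optional
from urllib.parse import urlparse

# Hypothetical policy: only these tools and hosts have been validated.
ALLOWED_TOOLS = {"browser", "text_editor"}
ALLOWED_HOSTS = {"example.com", "docs.example.com"}


def host_allowed(url: str) -> bool:
    """Allow a URL only if its exact hostname is on the allowlist."""
    host = (urlparse(url).hostname or "").lower()
    return host in ALLOWED_HOSTS


def action_allowed(action: Dict[str, str]) -> bool:
    """Reject actions that use an unvalidated tool or point at an unlisted site."""
    if action.get("tool") not in ALLOWED_TOOLS:
        return False
    url: Optional[str] = action.get("url")
    return url is None or host_allowed(url)
```

Matching the exact hostname, rather than checking whether the URL merely contains an allowed name, is what stops lookalikes such as `example.com.evil.test` from slipping through.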

  • @jpoole4931
    @jpoole4931 2 หลายเดือนก่อน

    looking forward to seeing you demo it.

  • @slawinsky8951
    @slawinsky8951 2 หลายเดือนก่อน +25

    I had a problem in my Android app that I couldn't solve for two weeks, and Sonnet fixed it in just a few seconds 😍

    • @holdthetruthhostage
      @holdthetruthhostage 2 หลายเดือนก่อน

      Oh I hope we can build Apps with it

    • @thetrueanimefreak6679
      @thetrueanimefreak6679 2 หลายเดือนก่อน

      2 weeks is a hell of a long time

    • @vexy1987
      @vexy1987 2 months ago +4

      @@holdthetruthhostage anyone with a technical mind, a little motivation, and no prior experience can do this already!

    • @Ristaak
      @Ristaak 2 หลายเดือนก่อน +2

      @@holdthetruthhostage Sonnet has been helping me get into modding and coding. Whenever I hit a roadblock and I'm not sure what to do, even if Sonnet can't fix it itself, it usually can guide me to the right resources to figure out what to do next.

    • @BoominGame
      @BoominGame 2 months ago

      @@vexy1987 On Linux, yes; on Windows it's a bit trickier, even with Docker.

  • @merlingrim2843
    @merlingrim2843 2 months ago +1

    Yeah, it makes more sense for the UI to be generated by AI in real time rather than the UI pulling data based on a declared layout that tries to adapt to the data, screen size, orientation, usage context, etc. Command-line apps are probably better suited to AI automation at this stage.

  • @scottcastle9119
    @scottcastle9119 2 หลายเดือนก่อน +1

    All I want is text length to increase and to have no limitations as a paid user.

  • @timurista
    @timurista 2 หลายเดือนก่อน

    Claude's coding prowess impresses again. Can’t wait for more models! 🌟

  • @Upstatecashew
    @Upstatecashew 2 months ago

    Is there a step-by-step video to get this working on my Windows machine to test? I'm not a coder but would love to test this out!

  • @dynamic_uwu
    @dynamic_uwu 2 months ago

    I was trying the same thing with the Groq vision API, but the main problem was that the AI models don't know where to click; they can't determine the exact clicking position. Anthropic really did a great job.

  • @royykahangwe
    @royykahangwe หลายเดือนก่อน

    This is fascinating. Can't wait

  • @Notepad123
    @Notepad123 2 หลายเดือนก่อน

    What’s the vid about building OS for AI called?

  • @dr4g0n76
    @dr4g0n76 2 หลายเดือนก่อน

    I guess the windows automation ids could be extended, or a model on top, to enable computer usage.

  • @rhaydon
    @rhaydon 2 หลายเดือนก่อน +1

    Amazing new feature that I can’t wait to see in the wild. One concern: what happens when nefarious actors discover this and get it to hack machines?!

    • @kristianlavigne8270
      @kristianlavigne8270 2 หลายเดือนก่อน

      Obviously we need all the desktop apps to be developed Agent First with specific agent interfaces and security built in to avoid this 😅

  • @thenoblerot
    @thenoblerot 2 หลายเดือนก่อน +1

    I👏want👏Haiku👏3.5👏with👏Vision!👏 So under rated at that price point!!!

  • @sesamring7065
    @sesamring7065 2 หลายเดือนก่อน

    The remarkable thing about AI agents is that they give us the ability to tap into the world's entire computing power for AI applications.
    When an AI agent has access to a computer, it can theoretically harness that machine’s processing power for its tasks, effectively outsourcing its computational needs. What's even more fascinating is the potential when multiple AI agents begin to interact and collaborate, amplifying each other’s capabilities.
    We also have a significant advantage now: scaling AI performance isn't just limited to physical resources, like adding more data centers for increased computing power. As OpenAI’s recent advancements demonstrate, we can now scale AI performance over time. This means we don’t need a massive global supercomputer to solve problems instantly. Instead, we can manage with less power by allowing more time to compute solutions.
    The possibilities are staggering. We're entering an era of unprecedented technological transformation.

  • @MakilHeru
    @MakilHeru 2 months ago

    It's interesting that they're doing this when Open Interpreter has been doing this for quite a while. You can hook it up to an API, allowing it to execute specific functions.

  • @michaelnurse9089
    @michaelnurse9089 2 หลายเดือนก่อน

    You KNOW you need to test it. That is the frontier for agentic models.

  • @RadiantNij
    @RadiantNij 2 หลายเดือนก่อน

    Thanks for the vid Matt. When are you gonna cover Nemotron 70b 🙏🏾

  • @hqcart1
    @hqcart1 2 months ago +2

    The new Sonnet is better at asking follow-up questions

  • @testales
    @testales 2 months ago

    Giving it terminal access would actually be the most relevant use case in my opinion. Finally I could tell it what I want and it could do all the complicated configuration stuff and deal with all the errors by itself while I just lean back and watch, instead of getting text walls of instruction trees thrown at me. When those fail at step 1, it all gets lengthy and complex quickly.

    • @intrestingness
      @intrestingness 2 หลายเดือนก่อน

      Text walls of instructions trees. Nice

  • @michael-jones
    @michael-jones 2 หลายเดือนก่อน

    Props for extracting the audio only 😂

  • @borjafat
    @borjafat 2 หลายเดือนก่อน

    Why didn't they mix screenshot-taking/pixel guessing with something like selenium to identify form, buttons, etc. and cross-check with the screenshot and pixel counting for added accuracy? i guess we can do this ourselves but if the model was trained on selenium usage and identifying XPATH and classes it would be very helpful.
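
A rough sketch of the cross-check proposed above: compare the model's predicted click point against element bounding boxes obtained from the DOM. The `rects` mapping here is hand-written sample data; with Selenium it could come from each element's `rect` (x, y, width, height), but mapping those page coordinates onto screenshot pixels (scaling, browser toolbar offset) is assumed away in this example.

```python
from typing import Dict, List, Tuple

Rect = Dict[str, float]  # keys: "x", "y", "width", "height"


def point_in_rect(point: Tuple[float, float], rect: Rect) -> bool:
    """True if the point falls inside the element's bounding box."""
    x, y = point
    return (
        rect["x"] <= x <= rect["x"] + rect["width"]
        and rect["y"] <= y <= rect["y"] + rect["height"]
    )


def elements_at(point: Tuple[float, float], rects: Dict[str, Rect]) -> List[str]:
    """Names of all known elements whose box contains the predicted click point."""
    return [name for name, rect in rects.items() if point_in_rect(point, rect)]


def click_is_consistent(point, intended: str, rects: Dict[str, Rect]) -> bool:
    """Accept the click only if the pixel guess lands on the intended element."""
    return intended in elements_at(point, rects)


# Hand-written sample boxes standing in for values read from the page.
sample_rects = {
    "submit": {"x": 100, "y": 200, "width": 80, "height": 30},
    "cancel": {"x": 200, "y": 200, "width": 80, "height": 30},
}
```

If the check fails, the agent could fall back to clicking the center of the intended element's box instead of trusting the pixel guess.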

  • @muhammadlufti2967
    @muhammadlufti2967 2 months ago

    This reminds me of Open Interpreter. I have actually been trying the computer use demo with Docker. It's pretty much a multi-turn conversation in which the model uses tools to capture screenshots, and the computer-use and other tools do the rest.

  • @thesimplicitylifestyle
    @thesimplicitylifestyle 2 หลายเดือนก่อน

    This is exciting! I can't wait to play around with it! 😎🤖

  • @MrVohveli
    @MrVohveli 2 หลายเดือนก่อน +2

    The life expectancy of service desk jobs just got shortened from years to months.

    • @erkinalp
      @erkinalp 2 หลายเดือนก่อน

      from multiple years to, just two 😉

  • @instatriage8859
    @instatriage8859 2 หลายเดือนก่อน

    I didn’t notice it the first time but their logging of computer actions is pretty nice too

  • @timurista
    @timurista 2 หลายเดือนก่อน

    Incredible potential for dev automation and coding boost! 🚀 Exciting times for developers.

  • @NeurodivergentTM
    @NeurodivergentTM 2 months ago

    This could actually become the next-level test automation tool, provided you maintain your professional skepticism and review the test scripts and results.

  • @kristianlavigne8270
    @kristianlavigne8270 2 หลายเดือนก่อน

    We need an Agent First approach when developing new apps and upgrading established apps. Having agents use a human interface is just a stop gap measure 😅

  • @matt.stevick
    @matt.stevick 2 หลายเดือนก่อน

    Thx Matthew B.

  • @martenrauschenberg4831
    @martenrauschenberg4831 2 หลายเดือนก่อน

    Kindly make a video about the usage of the computerized version and then maybe compare with what the rabbit R1 can do.

  • @jamesvictor2182
    @jamesvictor2182 2 months ago

    Why don't they send a 50px by 50px image 10 times a second representing the area under the mouse cursor? It's a tiny image to process and its content could be analysed to give precise feedback on mouse position in reference to the screenshot.
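
To make the 50px-by-50px idea above concrete, here is a small sketch that computes which region of the screen such a patch would cover, keeping it fully on screen near the edges. The function names are hypothetical, and the sketch only does the geometry; capturing and sending the image is left out.

```python
from typing import Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom)


def cursor_patch_box(cursor: Tuple[int, int], screen: Tuple[int, int], size: int = 50) -> Box:
    """Box of a size x size patch around the cursor, clamped to the screen.

    Assumes the screen is at least `size` pixels in each dimension.
    """
    x, y = cursor
    width, height = screen
    half = size // 2
    left = min(max(x - half, 0), width - size)
    top = min(max(y - half, 0), height - size)
    return (left, top, left + size, top + size)


def cursor_offset_in_patch(cursor: Tuple[int, int], box: Box) -> Tuple[int, int]:
    """Where the cursor sits inside the patch, needed to map patch pixels back to the screen."""
    return (cursor[0] - box[0], cursor[1] - box[1])


center_box = cursor_patch_box((512, 384), (1024, 768))  # (487, 359, 537, 409)
corner_box = cursor_patch_box((1023, 767), (1024, 768))  # (974, 718, 1024, 768)
```

A 50 by 50 patch is 2,500 pixels, about 0.3% of a 1024 by 768 screenshot, which is why the comment calls it tiny.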

  • @liamjohnson3247
    @liamjohnson3247 2 หลายเดือนก่อน +1

    How is this different from Microsoft Power Automate ?

  • @kamerondewart7002
    @kamerondewart7002 หลายเดือนก่อน

    I have a random question about AI OS. How hard would it be to take a fork of Ubuntu and add a personal assistant that has terminal access from install and voice control? I feel like a verbal UI can be made from the kind of assistance that has been around for blind people for a long time. It should be possible to make the entire OS accessible to an assistant with the correct permissions. It would not be integrated into the kernel, but why do we need to move the mouse and read the screen? It has direct access to the data without visual translation. The only time it needs visual cues is when it is looking at an image or video file for information. Most of its existence is pure data. I keep seeing new projects trying to use the computer like a human does. Why? I can't find anyone who can explain it to me. Why can't the LLM act like a DOS shell? Small LLMs are not as good as the leaders, but they are good enough that, with current methods of automatic iteration and a human to monitor and correct, they could bash their way through most tasks. Maybe I should take this question to Reddit. This is just where I am when it came to me. There has to be a reason because it's too obvious. Someone who knows why it won't work has surely thought of it many times.

  • @Let010l01go
    @Let010l01go 2 หลายเดือนก่อน

    It's very good if we use it to maintain or repair our computers, Great E.p❤

  • @ryanfranz6715
    @ryanfranz6715 2 หลายเดือนก่อน

    I know plenty of programmers still copy-pasting from stack overflow. I’ve been trying to tell people this is coming. I’m just like… pay attention or you’re gonna get blindsided. It’s like watching people picking up seashells before a tsunami.

  • @isaacwashington2701
    @isaacwashington2701 2 หลายเดือนก่อน

    A video on the computer control would be awesome

  • @hamsturinn
    @hamsturinn 2 หลายเดือนก่อน

    2:30 Gemini pro might not lead the MATH benchmark, because it used 4 shot unlike Sonnet which was 0 shot.

  • @jimbo2112
    @jimbo2112 2 หลายเดือนก่อน

    The most telling point in the vid is that all this is just a transitional solution while HITLs are still required. The key aspect beyond this is when it's a dependable, fully automated capability, is trust. Another AI blackbox system like this - where the user just sees the results - will need security measures to match. The opportunities for data theft in this situation are as scary as the functionality is promising.
    With proprietary data becoming the key value point in many industry sectors, security provision needs to ramp up alongside AI development.

  • @karenreddy
    @karenreddy 2 หลายเดือนก่อน

    Great video. I like these lesson videos, bravo.
    I'd argue the transformative aspect of 'The Thing' is a design hook rather than a visual hook, though I'd agree the alien form mid transformation is a good example of a visual hook. It's not as strong as the narrative-design hook though, imo.
    Take care!

  • @rachwalj
    @rachwalj 2 หลายเดือนก่อน

    Please test this for us and show us how it could actually be useful in day to day operations.

  • @DailyTuna
    @DailyTuna 2 months ago +1

    You need to do a video on how this works to show us how to implement this. We would appreciate it.

  • @MannySingh
    @MannySingh 2 months ago +1

    I like it: JITI, Just-In-Time Interface

  • @felhobacsi
    @felhobacsi หลายเดือนก่อน

    Yes, please create the testing video!

  • @michaelslattery3050
    @michaelslattery3050 2 months ago

    We already have modern OSes that can more directly interact with AI: Linux-based ones.
    You can do far more with Linux on the command line than you can with Windows, mobile OSes, or even macOS. There's a universal command, "man ", that gives you documentation. In the earlier days of Unix and Linux there was even more CLI control. If an AI OS is to be built, it should be built on an existing Linux OS, as it is already a good start, and there's tons of existing training data.
    The gap really is due to GUI software like web browsers and office suites. But even those have CLIs and APIs that an AI could talk to. Or you can stick to text formats. For example, I sometimes use Vim + Markdown (or LaTeX) to generate a .pdf instead of MS Word, as I have a ton more control. And I think that Jupyter Notebooks are superior to Excel for dealing with data and graphs.

  • @dkracingfan2503
    @dkracingfan2503 2 หลายเดือนก่อน +1

    4:06 The audio has issues

  • @nixellion
    @nixellion 2 months ago

    7:00 How is it going to be useful if it can't have access to your login information? I suppose they mean you should log in to all required places yourself inside the VM before querying Claude for actions. So it won't need your passwords, but it will have access to whatever services are needed.

  • @MatthiasRMoore-ob9ni
    @MatthiasRMoore-ob9ni 2 months ago

    Actually, Matthew, the Rabbit R1 had the LAM feature about 2 months ago, which does the same as Claude in Docker, but Docker runs faster of course.

  • @ismetdere
    @ismetdere 2 หลายเดือนก่อน +1

    I'm waiting for your vid about Nemotron :)

  • @JoaoMSimoes
    @JoaoMSimoes หลายเดือนก่อน

    I can't stop thinking about what malicious people can do with these tools, like click farms, bypassing those human-verification quizzes on pages, etc.

  • @ElizabethRogers-z5h
    @ElizabethRogers-z5h หลายเดือนก่อน

    How is 'computer use' different from a software program in Python written for me by ChatGPT 4? I asked ChatGPT to write something that will go to YouTube, fetch the info I require, return to my computer, open up an Excel file and write the collected info into a spreadsheet. The Python program it wrote for me, with the API, works perfectly, even opening a new tab when the sheet is full. I'm not a programmer/developer, so please forgive me if I'm sounding naive. I would just genuinely like to know how this differs from the Anthropic 'computer use'. Can anyone help?

  • @grantsigmon
    @grantsigmon 2 หลายเดือนก่อน

    A missing-file icon on your page is actually very 90s.

  • @linkup2345
    @linkup2345 2 หลายเดือนก่อน

    Of course we want you to test it dawg

  • @aamir122a
    @aamir122a 2 months ago

    Yes, I think independent testing is required. Do make a video about this.

  • @TechYo_Tube
    @TechYo_Tube 2 หลายเดือนก่อน

    Ai Doomer: "When Ai starts doom-scrolling, we're all doomed."

  • @hemiedwards217
    @hemiedwards217 2 หลายเดือนก่อน

    Would love to see a cat try to do that Yann LeCun, lol.

  • @kevincodes674
    @kevincodes674 2 months ago

    Lol at Claude wanting to research Yellowstone National Park on its own. Very cool stuff though

  • @dennis4248
    @dennis4248 หลายเดือนก่อน

    I hope it will soon offer vacuum cleaner use too.

  • @cognitive-carpenter
    @cognitive-carpenter 2 months ago +1

    Matthew: first person to get his personal bank account trolled by an LLM