Damn bro, this is insane, and the stealth release: Claude is what OpenAI was before the Altman coup
altman coup is what the world was before the ai coup
OpenAI is gearing up for a launch in a few days, would be interesting to see their response to computer use
@@ryangreen45 do you know the exact date?
Yeah but we don't care about any of this, we just want Scarlett AI girlfriends.
@@ryangreen45 why do you think openai is planning a launch soon? just curious :)
I asked it to go to Dell's site and configure a good laptop for my engineering class, keep the cost below $2000.
It went there, configured an XPS with the options I asked for. Final price $1499
Bruh was this paid..do we need to pay for the api key or something
@@HemanthTerli of course.
Wow, you're quick with the content. Love it. Looking forward to seeing how you feel the coding has improved (if at all) with the new sonnet
It has improved greatly for me
On the Anthropic blog post, it states Claude 3.5 Haiku comes out to everyone later this month...which is like 9 days or less
Kris - you are a most wonderful and nice person to all of humanity with your testing. I am very grateful. Thank you and please keep pushing the envelope! 😊
Getting a good idea of API costs, and whether they vary by task or instructions, would be great to test.
Remember the Rabbit R1? lol less than a year from now anyone will be able to deploy something similar that actually works. Your expression was like a kid who just got what he wanted for Christmas haha. Great video as always!
Yeah the Rabbit software was a good idea. They just never really implemented it. It'd be cool to be able to see something like this using phone apps now.
Apparently Humane AI Pin and Rabbit R1 are both working on an app
Amazing! Bro, I didn't expect this development. OMG, I still can't believe it. This is like a UiPath automation app... incredible!
add realtime voice control to take it even further
How many tokens did it all cost?
I did a much, much smaller test with that and it was about 200k tokens, which was about $0.50
So I see how this is going to be a very, very expensive service to use
Yes, but think of the stuff you would not want to share with workers or VAs. This could do it for you, and your data and ideas stay safe, not leaked :)
This is the main question after one is in awe of the service.
I used it to find the contact number of a company, a relatively short request, and it was about $0.02. I've played around with it for about half an hour, trying things, and it's cost me £0.63 so far. Not too bad.
These are the last days of GUI and web. AI to database is the future.
Makes you wonder what will be our computer interface in just a couple of years.
There's no need for us to know anything about the files and folders. We just ask for things and the results appear in front of us.
I've been pondering about this a lot. Lots of redundant things happening if AI becomes powerful enough. GUIs may become "legacy mode".
I’d love to see a comparison of this to what you can do with open-interpreter these days.
cant wait to spend more time with family, friends, my community. im pro work but time is more valuable than anything else
I guess Meta (Zuck) made the right bet getting their VR sunglasses out soon. I mean, soon we can just talk with these agents on the fly through VR sunglasses.
The SWE-bench for Haiku 3.5 got me really excited, it beating the "old" Sonnet 3.5 is insane.
Finally, this is the equivalent of giving a robot hands. Anthropic truly leading the way
Very cool. Thanks for sharing so quick!
This is insane... And brings us another step closer to "Her".
I tried to use it to write some code on a platform that I use; it stops every few minutes complaining that there aren't enough tokens. It ended up consuming the whole 1M I have in a day and still hasn't got the print hello world right on that platform.
And if you add a program to listen to the user's voice and then perform those actions you have a computer that can listen and create! The future is awesome!
mindblowing, ...good to know that A.I. will take over my work in the future, so I have time for more useful things :D
Looks cool, hope I can find a way to use it in practice
Directly from Brazil. Great video brother! With this great rise of A.I., which profession on the rise in the United States would you recommend and view favorably?
we need longer videos ❤
Do you think this is a step towards AGI?
Your channel is not just a place for entertainment, it is a source of inspiration and wisdom. Thank you for your creativity and diligence!🌇🐌🌙
Amazing demo
cool, nice examples. Thanks!
Claude starts with a remark about access and use (hijack) while you have Claude perform a task. Does no one else find this a problem? It seems to me a goldmine for hackers to hack the process during a task. A kind of holy grail.
Very good news and very good tests!
Now we need Llama to implement this, and have a ready-made Docker image with voice interaction to control my computer
test it on Mac :) - I will wait a few days to gather other people's feedback, then I will try it :)
This is incredible!
So impressive!
Soon there will be a new wave of operating systems, with voice control only. Think Iron Man and Jarvis AI. Especially with augmented reality glasses as the user interface.
it is fun to use but a bit costly
The server 127.0.0.1 rejected the connection. What could be the reason? It is not showing the desktop; I only have the chat on the right side.
I have this issue I can't fix while setting mine up, and the AI can't fix it either. It's been HOURRRRRSS...
(HTTP code 500) server error - unable to find user computeruse: no matching entries in passwd file
missing the voice to voice option, hope it comes soon
I'm watching this and just feeling the AGI...
Rate limiting is a bit annoying. Why can't it continue after the rate limit has reset / cleared? I hit it nearly every time.
I get rate limited quickly when using it, how do I fix this?
Is it possible to run this feature directly on my machine, not in a browser window? E.g. open VS Code, etc.?
If yes, where can I find a guide?
You don't want to do that. Really.
@@hippopotamus86 why
@@sLavoncheg Giving it access to your computer is dangerous. It's in very early stages: it may not always follow commands correctly, could leak personal data, cause harm to your computer, etc. Currently it only runs on the Ubuntu instance anyway. This is nowhere near a finished product. You'll also struggle to do much with it, as there are very low usage limits at the moment.
Can someone explain what the big hype is about? There are Python libraries that already do the clicking/opening apps/etc., and there have already been agents for the past couple of months. What is this about?
Make a vid getting it to do all the stuff Rabbit R1's LAM was supposed to do
Damn tbh, this was pretty cuul
how much does it cost? this must be very costly, and people are going to get a surprise bill after playing around with it for an hour.
costs the same as a claude api call, shouldn't be too expensive actually because there is not so much text output, although reading images might be slightly more expensive
Could it be run on Windows 10/11? I want to steer Windows apps.
Interesting how it fails on the chess task - it believes it plays e4 both times when it actually plays f4, and still thinks it has played e4 after the second ss, then just implodes when it tries to capture on d5. It calls the variation on the first game the Dutch, which is true if the first move is f4, but e4 d5 is the Scandinavian, so it's weird that it uses the name for f4 d5 when it seems to think the position is e4 d5.
If Anthropic could just provide an API where you provide a screenshot and a description of the item you want to click on and it returns a set of coordinates, that would be amazing. I've been digging through the computer use code but it's far from straightforward.
Osr it’s been a thing
Ugh, so I wrote an extensive lib and APIs to do all this, as the interface to 'anything' was missing in LLMs... and now Anthropic went and created a whole system...
(edit) never-mind. I just tried Anthropic's system, it's shit at parsing arbitrary web pages and 'seeing' and understanding them, their whole approach is wrong, they don't understand how to abstract web page info into a way that's useful and expressive for an LLM to understand, even compared with my toy web page abstraction system, how did they screw this up?
That sounds amazing, do you have a public git repo or anything i can play around with hahaha
I keep getting rate limit errors after a few consecutive actions, noticed you didn’t?
Looks and feels like gimmicky garbage
Why are you using that VM? It’s way better on normal Mac.
Speed run clicking on one of those scam links and giving up banking details
I'll use Claude to use ChatGPT so I dont have to.
Can this automate comfyui?😅
Basically it's the same as AGENT OS in Open Interpreter 😅
amazing
I'm a programmer, now I'm scared!
It's a good way to open your computer up to AI hacking
quitting after 1 move in the chess game?!? lol so funny
Lol, it played against @AnnaCramling 's mother
Time to automate runescape even more so.
Anyone else running into rate limits?
i need to pay more ?
Very cool, but this is nothing new. They may refine it a bit, but the idea is not original. I'm glad to see this working as well as you show, but I have mixed feelings.
I love it. I hate spreadshITS. This is going to do the boring work, while i pump my muscles at the gym.
Yeah! use this SpreadShit 😅
when it gets better. Make money for me and leave for work lol :d