FRAUD--or FANTASTIC?! Figure 01 Robot Demo Analysis W/Scott Walter
- Published Mar 15, 2024
- Go to sponsr.is/zbiotics_drknow_0324 or scan the QR code and get 15% off your first order of ZBiotics Pre-Alcohol Probiotic by using my code DRKNOW at checkout.
Thanks to ZBiotics for sponsoring today’s video!
**If you are looking to purchase a new Tesla Car, Solar roof, Solar tiles or PowerWall, just click this link to get up to $500 off! www.tesla.com/referral/john11286. Thank you!
Join this channel to get access to perks:
/ @drknowitallknows
**To become part of our Patreon team, help support the channel, and get awesome perks, check out our Patreon site here: / drknowitallknows . Thanks for your support!
Follow Figure on x: / figure_robot
Follow Scott on x: / goingballistic5
Get The Elon Musk Mission (I've got two chapters in it) here:
Paperback: amzn.to/3TQXV9g
Kindle: amzn.to/3U7f7Hr!
**Want some awesome Dr. Know-it-all merch, including the AI STUDENT DRIVER Bumper Sticker? Check out our awesome Merch store: drknowitall.itemorder.com/sale
For a limited time, use the code "Knows2021" to get 20% off your entire order!
**Check out Artimatic: www.artimatic.io
**You can help support this channel with one click! We have an Amazon Affiliate link in several countries. If you click the link for your country, anything you buy from Amazon in the next several hours gives us a small commission, and costs you nothing. Thank you!
* USA: amzn.to/39n5mPH
* Germany: amzn.to/2XbdxJi
* United Kingdom: amzn.to/3hGlzTR
* France: amzn.to/2KRAwXh
* Spain: amzn.to/3hJYYFV
**What do we use to shoot our videos?
-Sony alpha a7 III: amzn.to/3czV2XJ
--and lens: amzn.to/3aujOqE
-Feelworld portable field monitor: amzn.to/38yf2ah
-Neewer compact desk tripod: amzn.to/3l8yrUk
-Glidegear teleprompter: amzn.to/3rJeFkP
-Neewer dimmable LED lights: amzn.to/3qAg3oF
-Rode Wireless Go II Lavalier microphones: amzn.to/3eC9jUZ
-Rode NT USB+ Studio Microphone: amzn.to/3U65Q3w
-Focusrite Scarlett 2i2 audio interface: amzn.to/3l8vqDu
-Studio soundproofing tiles: amzn.to/3rFUtQU
-Sony MDR-7506 Professional Headphones: amzn.to/2OoDdBd
-Apple M1 Max Studio: amzn.to/3GfxPYY
-Apple M1 MacBook Pro: amzn.to/3wPYV1D
-Docking Station for MacBook: amzn.to/3yIhc1S
-Philips Brilliance 4K Docking Monitor: amzn.to/3xwSKAb
-Sabrent 8TB SSD drive: amzn.to/3rhSxQM
-DJI Mavic Mini Drone: amzn.to/2OnHCEw
-GoPro Hero 9 Black action camera: amzn.to/3vgVMrH
-GoPro Max 360 camera: amzn.to/3nORGYk
-Tesla phone mount: amzn.to/3U92fl9
-Suction car mount for camera: amzn.to/3tcUfRK
-Extender Rod for car mount camera: amzn.to/3wHQXsw
**Here are a few products we've found really fun and/or useful:
-NeoCharge Dryer/EV charger splitter: amzn.to/39UcKWx
-Lift pucks for your Tesla: amzn.to/3vJF3iB
-Emergency tire fill and repair kit: amzn.to/3vMkL8d
-CO2 Monitor: amzn.to/3PsQRh2
-Camping mattress for your Tesla model S/3/X/Y: amzn.to/3m7ffef
**Music by Zenlee. Check out his amazing music on Instagram: @zenlee_music
or YouTube: / @zenlee_music
Tesla Stock: TSLA
**EVANNEX
Check out the Evannex web site: evannex.com/
If you use my discount code, KnowsEVs, you get $10 off any order over $100!
**For business inquiries, please email me here: DrKnowItAllKnows@gmail.com
Twitter: / drknowitall16
Also on Twitter: @Tesla_UnPR: / tesla_un
Instagram: @drknowitallknows
**Want some outdoorsy videos? Check out Whole Nuts and Donuts: / @wholenutsanddonuts5741
It's totally fake!😆😆😆
When the robot hands the apple to him, it's obviously just preprogrammed to drop the apple at a fixed position in space. If the guy's hand wasn't there, the apple would've fallen on the table. To be a real handoff the robot needs some time to track the motion of the guy's hand, and judge his intended movements.
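The distinction this comment draws — releasing at a fixed point in space versus actually tracking the receiver's hand — can be illustrated with a toy controller. Everything below is hypothetical (made-up positions, speeds, and tolerances), not a claim about how Figure's stack works:

```python
import math

def handoff_release(hand_positions, grip_pos, speed=0.05, tol=0.02):
    """Move the gripper toward the most recent tracked hand position
    each tick; release only once it is within `tol` of the hand.
    A fixed-point drop would ignore `hand_positions` entirely."""
    x, y = grip_pos
    for hx, hy in hand_positions:           # one tracked sample per tick
        dx, dy = hx - x, hy - y
        dist = math.hypot(dx, dy)
        if dist <= tol:
            return (x, y)                   # close enough: release here
        step = min(speed, dist)
        x += step * dx / dist
        y += step * dy / dist
    return None                             # never converged: keep holding

# The hand drifts sideways while the robot approaches; the release
# point ends up near the hand's actual location, not a preset spot.
track = [(0.30 + 0.01 * t, 0.10) for t in range(40)]
release = handoff_release(track, grip_pos=(0.0, 0.0))
```

If the hand moved and the controller still dropped at the original coordinates, you'd have the "preprogrammed drop" this commenter suspects; tracking makes the release point follow the hand.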
He uses a motion tracker to detect the position of the hand.
I loved the realism of the voice; the stutter, the "on its", and "sure thing" all add to the realism.
I believe the ideal anthropomorphic robot's idle motion while you communicate with it should be to tilt its head side to side, as if it were an inquisitive dog…
I think Data did that
agreed, the way it's not at all tilted is cringe
I asked Grok, and the answer was that the voice was based on Brett Adcock, the CEO of Figure. When I hear him talk in other videos, I can believe this.
Sounds almost nothing like him.
The voice is Steve Jobs, the Apple was a head nod toward him.
I think it's James Douma
I think it’s Jackie Chan
So? What's the point?
Did you notice how the robot pushed the tray forward when it finished … that little extra motion. Like here , I’m finished.
"I'm finished. Soon, all you (sub)humans will be 'finished'."
FANTASTIC!!! It's confirmed to my totally amateur satisfaction. This is a giant leap in technical world history, just like I said the first time I saw it.
2 Things:
- I want to see it put a screen protector on a phone, with no dust or bubbles
- What happens if, while it's doing something, I put my hand in the way? Does it stop and start again?
Tele-robotics taught the neural networks, then the tasks were all performed completely with neural networks. A human operator repeatedly picked up and deposited the trash many times, each time somewhat differently, to teach the neural network how it's done. The AI then inferred the generalized actions required to accomplish the task. Tipping the tray for the trash, for example, is very human (similar to the "um" in the speech). All similar to training a new employee, but an amazing accomplishment for a machine.
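The training recipe this comment describes is essentially behaviour cloning: supervised learning on (observation, action) pairs collected from teleoperated demonstrations, with the policy then generalizing to unseen situations. A toy sketch of that structure — the "task", data, and linear least-squares policy are purely illustrative stand-ins for Figure's actual deep networks:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical teleop demonstrations: each pairs an observation
# (object x, y on the table) with the operator's commanded action
# (gripper x, y). The "task" is simply to reach over the object.
obs = rng.uniform(0.0, 1.0, size=(200, 2))
noise = rng.normal(0.0, 0.01, size=(200, 2))      # operator variability
actions = obs + noise                              # reach to the object

# Behaviour cloning at its simplest: fit a least-squares policy
# mapping observations to actions. Real systems use deep networks,
# but the supervised-learning structure is the same.
X = np.hstack([obs, np.ones((200, 1))])            # affine features
W, *_ = np.linalg.lstsq(X, actions, rcond=None)

def policy(o):
    return np.array([o[0], o[1], 1.0]) @ W

# The fitted policy generalises to object positions never demonstrated.
pred = policy([0.5, 0.5])
```

The "each time somewhat differently" detail in the comment matters: the noise across demonstrations is what lets the fit recover the underlying task rather than memorizing one trajectory.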
I am interested in using AI to build a sequential animation channel. Are you planning to utilize AI in your work? I know there are plenty of tutorial videos already on YouTube, but I would like to see some animation educational content from you, perhaps on a second channel? Thanks!
Dude he angles the basket to make it easier to drop the objects in! That’s cool as shit! I think I’d do exactly the same thing…
I would have placed the dish and cup more carefully in the drying rack, though.
I'd like to hear your review of the Apptronik video.
It would be fun to see a bot that giggles like John.
This is exciting! Don't forget who Brett is. With the additional players, they have the resources and the compute.
Who is he?
He isn't someone who scales a technology into a mass marketed product.
Agreed. The Chinese government is 150% behind bot development and plans to dominate that industry. I hope you enjoy this as much as I do @@jdcarguy1242
My OpenAI voice assistant does it all the time; it sounds exactly the same.
I didn't think it was fake, but I do think it's a canned demo and not indicative of Figure's ability to perform a variety of tasks with only voice commands or to describe a variety of scenes or to plan tasks in a variety of contexts. I don't know the breadth of Figure's abilities because the video didn't show me. If it has broader abilities, a video demonstrating this breadth is easy enough to construct. I'll believe it when I see it.
Atlas performed more impressive tasks years ago anyway. It doesn't speak or respond to voice commands as far as I know, but Boston Dynamics could enhance Atlas this way easily enough. Speech recognition is not new. My four-year-old car's navigation system recognizes speech in a narrow context. It can't respond like ChatGPT, but navigation systems will soon enough. Automated systems of all kinds will, not only humanoid robots.
The video of Figure's bot loading a pod in a coffee maker began with a slide giving the training time for the task (ten hours). This video does not mention training time. Why do you think that is?
You're leaping to all sorts of conclusions about how Figure programs (or "trains" if you prefer) this bot to do what we see in this video. You don't actually know. You only know that you expect humanoid bots to replace much human labor in the near future, so Tesla can make a mint, and you're interpreting everything you see in terms of this expectation.
I tend to agree. Humans being humans, we tend to see this sort of thing and presume much more beyond the demonstration, projecting the immediate possibilities as far wider and more flexible than they actually are. That said, it's deeply impressive and will only get better, and that will happen all the quicker with all the competition. The other positive is that this process means we see things in the public domain a lot earlier than we otherwise would, as they all want to look like they are leading the curve in the minds of the public and potential users.
I don't know... I wish this were live with a moving camera. But as for Boston Dynamics, there is a big difference in the programming: "move your arm 3 inches and then down" vs. a bot with AI learning to just do a task. We will see in 2024-2025 who the fakers are.
@@shawncooper8131 You've hit the nail on the head. Boston Dynamics have never said they have any AI. Their Atlas robot is 100% human programmed. Every. Single. Movement. Every. Time.
Atlas also uses hydraulic actuators, which can't run for many minutes before recharging. It was never designed to work for hours. It's a research robot. Only BD's dog robot, Spot, is being sold and that is also highly limited and sold to research teams to help with robot programming research.
An LLM on a robot: would it take more power on the robot, reducing the power it has to do useful work?
I don't know if I would want a robot that talks back to me. At my factory or my work place😅
Probably two separate devices. The big question is whether the chat is just describing what the bot did.
LLM is in the cloud online. Robot is just connecting to it via wifi (or wire).
You could put it in a robot if you wanted, but it would probably take a lot of power, as it would have to run some strong GPUs.
Question
Did it identify the red object as an apple, then look up whether it's edible by humans?
Ragdoll animation with motion blending has been around for decades.
I hope they make use of that animator knowledge.
I wonder why it didn't grip the apple perfectly; it moves a bit just as it is closing the grip.
That robot voice reminds me of the John Kramer character in the Saw movies.
It takes about a week to add that visual modality to Grok.
Have you chaps played around with 11 Labs voice tools. Quite astonishing.
Cheers
Just got 12.3 today Buford GA
Is the LLM programming the movement?
I wonder what would happen if the cup were half filled with water. A human would think: I can't do this; I have to finish my drink or pour it in the sink before placing the cup in the tray.
Feels like the Google Gemini presentation. Low-key getting some Nikola vibes.
Why? It's at 1x speed and unedited.
Boston Dynamics' most recent video did use CGI for several seconds, but I don't see where anyone else noticed this. (The item the robot was lifting changed color from black to white while being lifted.)
They made the robot talk in a cute way lol
Regarding Optimus
Perhaps Musk and his team are reluctant to demonstrate all this because it did not take long for the Chinese to make an exact copy of their original.
What will be exciting is when they finish their factory that makes the model, post a video of it, and we see 100 robots assembling a car.
Getting a major Anki Cozmo vibe.
Yellow John and red Scott. Call the color continuity coordinator.
So you get hammered.... Good for you
Sounds downhill familiar?
One thing I found interesting was the way the guy giving the instructions was standing very still with his hand motionless on the table. I suspect the current state of the AI video recognition is that it gets confused by movement, so he had to stay almost motionless, and then had to rush to adjust his palm position to catch the apple. That suggests the robot training is still somewhat rigid: it isn't able to put the apple where his hand is, and always drops it in a certain place. The video recognition also made some mistakes, like when the bot says "cups and a plate" in the dish rack instead of "plates and a cup".
Networks take in onboard images at 10 Hz and generate 24-DOF actions (wrist poses and finger joint angles) at 200 Hz.
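One way to read those numbers: the vision network refreshes a target roughly every 100 ms, while a faster inner loop streams smoothed action commands at 200 Hz in between. A toy simulation of that two-rate structure — only the rates come from the quoted spec; the setpoint logic and smoothing gain are invented:

```python
def run(duration_s, vision_hz=10, control_hz=200):
    """Count how many vision updates and action commands occur.
    Each vision tick produces a new target; each control tick
    steps the commanded pose a fraction of the way toward it."""
    ratio = control_hz // vision_hz          # 20 control ticks per image
    pose, target = 0.0, 0.0
    vision_ticks = action_ticks = 0
    for t in range(int(duration_s * control_hz)):
        if t % ratio == 0:                   # a fresh image arrives
            vision_ticks += 1
            target = (vision_ticks % 5) / 5  # stand-in for a new setpoint
        pose += 0.2 * (target - pose)        # smooth 200 Hz tracking
        action_ticks += 1
    return vision_ticks, action_ticks

v, a = run(1.0)   # one second: 10 vision updates, 200 action commands
```

The design point is that the hands never wait on the camera: commands keep flowing at 200 Hz even though perception only updates at 10 Hz.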
So Figure 01 was doing a bunch of pre-trained things, prompted by results from an LLM that was interpreting instructions. Was the LLM running remotely in a data centre? Probably. So the LLM was talking while the robot followed instructions. These are two separate things operating together in a well-rehearsed scenario. Very impressive technology, but not really all that capable. Now, if all of that were running in a model on an inference engine within the Figure robot, that would be impressive. I think it is a glimpse of what might be, but not really there yet.
This
I think there's still a ways to go on this.
Sounds like Bob Odenkirk to me.
exciting! typo.
I'd bet they are using Whisper and ElevenLabs and prompting the model to sound like a human, speaking with ums and repeated text. You can provide a 20-second sample and it'll sound quite real.
Also, MIT had a listening bot that would nod and gesture at really appropriate points when spoken to. This was pre-LLMs. Bots could use this for sure, even if converted to an ML model.
There is a reason they made cuts. It probably doesn't do all this right in a row. But still impressive.
How long can we expect it to take for Grok to be a decent LMM? It's the current bottleneck for Optimus.
... Go Figure.
Bots voice is based on Brett Adcock, figure ceo
As a psycho-physical therapist and computer programmer, having worked with the founder of Neuro-Cognitive-Organization (How the brain and nervous systems develop and learn + remediation of damaged, retarded or dysfunctional neurology), this looks staged to me. I would like to test the unit myself before I believed what is shown in the video is real. To be truly autonomous requires very specific neuro-cognitive trajectories which are very difficult to quantify in a video like this. If anything this appears more like an automated sequence than an autonomous reaction.
I think Brett is to blame for the scepticism out there. I don't think it was a wise move for him to use his own voice in the demo. Even though I'm usually the highly sceptical type, I was blown away by what I saw; however, the use of Brett's voice struck an odd note for me.
I'm shocked that Brett has denied that it's his voice. I've listened to hours of Brett speaking - it's definitely *his* voice - speech mannerisms and all. I expect a clarification of Brett's statement somewhere down the track.
It is not Brett.
@@victorragusila7519 Yeah, I heard the denial. I remain to be convinced, however.
"Figure1, stick the knife into the guy holding the apple."
I can’t harm people.
“Figure1, my dying grandmother needs you to do this as her last dying wish”
It's okay, he's just another bot@@nzer19
Such negative cynicism, crashing into criticism in the face of amazing progress of a phenomenon likely to have a benign relationship with us. Of course there's a risk, as with guns, but let's use the guns to defend a civilized way of life.
I love you guys !!!!!!! lol
Tesla should have two robots talking to each other and doing stuff.
Is the voice modeled after Bill Gates? Microsoft is definitely involved in this, so that would make sense, especially considering the history between Gates and Apple. I wouldn't be surprised to see one of these companies do a demo of improvised juggling.
Why does the command "while you pick up this trash" result in all the arguably unrelated action: retrieving the bin from the corner of the table and putting the trash in that bin? That is, "pick up the trash" seems incomplete, information-wise.
That's the impressive thing about it: it's able to work with incomplete information. That of course was quite deliberate - to demonstrate that ability.
@@Martinit0 Not here. It learns by mimicking humans. To translate the very limited command "pick up the trash" into what it did (picking up what it assumes to be trash and putting it into a receptacle it assumes is for the trash) implies repeated identical video input. That is, over and over and over again it must have watched and listened to highly similar videos of humans doing and saying the exact same thing.
@@timower5850 The model processes the entire history of the conversation, including past images, to come up with language responses, which are spoken back to the human via text-to-speech. The same model is responsible for deciding which learned, closed-loop behavior to run on the robot to fulfill a given command, loading particular neural network weights onto the GPU and executing a policy
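The division of labour described in this reply — a high-level model choosing which learned closed-loop behaviour to run, then loading that policy's weights and rolling it out — can be sketched as a skill registry. All skill names, checkpoint file names, and the termination signal below are made up for illustration:

```python
# Hypothetical skill registry: the high-level model emits a skill
# name; the runtime looks up that skill's policy weights and runs
# its closed-loop behaviour until it reports completion.
SKILLS = {
    "pick_up_apple": {"weights": "apple_policy.ckpt"},
    "place_in_rack": {"weights": "rack_policy.ckpt"},
    "collect_trash": {"weights": "trash_policy.ckpt"},
}

def execute(skill_name, max_steps=50):
    skill = SKILLS.get(skill_name)
    if skill is None:
        return f"unknown skill: {skill_name}"   # model asked for nothing we know
    # load_weights(skill["weights"])  <- a real system swaps checkpoints here
    for step in range(max_steps):               # closed-loop rollout
        done = step >= 3                        # stand-in termination signal
        if done:
            return f"{skill_name}: done after {step + 1} steps"
    return f"{skill_name}: timed out"

result = execute("pick_up_apple")
```

The key design choice this illustrates: the language model never streams motor commands itself; it only selects which pre-trained policy runs next, which is why speech and motion can overlap.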
Weird how even tech people aren’t aware that filler words, stutter and even breathing sounds are part of modern TTS. ChatGPT voice does all these things.
Correct; they obviously don't talk to GPT much on their phone, nothing special about the voice here.
Yep, it's identical to what I use, and yes, it does exactly that.
People know it is possible. It is highly suspicious to add it at this stage when regular ChatGPT is text.
@@user-lb4yy7vi2t Regular ChatGPT isn't just text; it has voice, and it's had voice for quite a while.
26:00 The reason Tesla hasn't released a video of Optimus being this far advanced is because THEY DON'T NEED TO! Tesla doesn't need to raise capital like Figure does. Instead of spending time rehearsing the bot for a video release, they are busy surpassing all of these capabilities.
Hopefully, they can fix the many quality-control issues with their Cybertruck. I've seen cars built in Mexico with better quality control and finish.
The reason I have not posted any videos of my robot is the same...
Can't wait for Optimus' next video. It's like tit-for-tat improvement. This race to the top will help accelerate the bots to be so much more advanced, and that is the goal.
The biggest part of an actual bot business is the manufacturing side. Tesla is the king of manufacturing; if you can't do it at scale, at profitability, then it's meaningless. No company will do better than Tesla. They've already proved that with vehicles: Volkswagen is over 100 years old, yet Tesla makes a car three times faster than they do and actually makes a profit on EVs, while Volkswagen loses on every EV sold.
Trevor 2.0
Sounds a bit like Geordie Rose from Sanctuary AI
It totally sounds like Brett's voice
Gold star question!!!
Why have we not seen Tesla do this???
Because Elon Musk is a man baby and said he won't focus on AI unless he has 25% of the company's stocks..... he has 20%
@@TheSelf918 you're deranged
TARS from interstellar...for voice question.
The voice is definitely Sam Altman.
No way the hesitation in the bot's speech is due to thinking; more likely it was programmed to respond modestly and mimic a human.
Whether it was teleoperated or not, they should tell us.
do or not do. There is no "should."
Brett Adcock posted on X on March 13 at 14:02 that 'The video is showing end-to-end neural networks. There is no teleop' 😊
They literally tell you in the first 20 seconds of the demo
I genuinely thought optimus would be at this level by now.
So this was for Warren ...he's the only person I've seen accusing it of being fake... is that correct?
I agree that it is possible that it is as it appears, but it is cutting edge and very well staged.
By staging, I mean switching between blocks to everyday objects such as dishes and a drying rack.
The response time from OpenAI may have been juiced up (a special priority account or magic token, or high powered VM).
There was attention to small details like tilting the trash container and an upbeat, but California laid back server speak.
Not everyday speech, but the way a server in an ice cream shop scooping an ice cream cone might speak.
It's like the difference between early MS Windows and Mac OS, MacOS reflected Steve Jobs' obsession with small user facing details such as having nice fonts and the screen scrolling smoothly.
I've used advanced AI and this is NOT fake. Give this 5 years and these robots will be entering the market and taking jobs.
It's Jensen Huang's voice
“Nothing of this is magic.”
Well, it actually *is*, in the sense that sufficiently advanced technology seems like magic to mere mortals without an understanding of LLMs and AI.
The voice is clearly Sam Altman.
But can I tell it, "This is an R22 we're starting this week. It's basically the KT22, but with no tape." And have it still tell those apart next month?
Now watch the original video again, and compare the main cuts with the 'replays' at the end. Notice that the apple isn't the same (they likely dropped it at least once), and the movements of placing items in the basket aren't the same (watch the plate/cup dynamics). They shot this multiple times and composited it to create the intended result, so it's not the 'real time, one shot' they make it out to be. The robot's movements are so close in those comparisons that it seems highly likely they are sequenced, not dynamic (even if the object placements are very close, there would be some natural variance). I don't think you actually spent much effort on objective analysis other than to contradict the naysayers.
@DrKnowitallKnows: Since these robots learn by watching video of humans, will they tend to be right handed?
Whilst the motion of the bot is impressive, this presentation, like others, isn't conclusive and is open to question. Until a presentation is done live with random participants (non-company players) and randomly chosen activities in a given environment (it would have to be an environment that includes objects and potential tasks the bot has learnt), no definitive conclusion can be drawn. Everything shown here could have been a preprogrammed sequence. A live presentation, with random participants and independent observers, is the only forum that can provide anything conclusive. Once that is done, all our minds will be blown.
I'm the one that called out the stuttering, but I don't know squat about LLMs. I find it odd that Figure 01 stuttered, as I don't think AI would do that. I'm just applying healthy skepticism, as I think there will be a Trevor Milton in the AI explosion who scams investors out of their money.
I'm not calling out Figure in general as Nikola but is the stuttering a glitch?
It would make sense to have a silent pause but why say uhhh in the middle right?
Sometimes the text-to-speech will tend to make those sounds, also for example, shouting or whispering
Additionally, you can also instruct GPT to use informal language with contractions/pauses
Thanks for the explanation about the informal part, as I thought that would cause too much lag and energy use, requiring shorter run times between charges.
Is the voice HAL? Can we get it to say "I can't do that, Dave"? Come to think of it, didn't HAL start to stutter when his chips were being pulled out?
If they say "our robot can do this" while they have an entire datacenter behind it, I feel somewhat cheated!
?
A bit weird that the robot at one point said: "I gave you the apple because it's the only, ehhh, edible item I could provide you with..."
Would it use "Ehhh"...? Isn't that very human? Was it a recording?
The robot should be named "Stick" we're then looking at Stick Figure 1 😊
Tesla hasn't demonstrated their advanced Optimus capabilities including integration with Grok because they don't need to impress potential investors like Figure and others in the space. When Tesla feels the need to set the record straight that they are the industry leader, then they'll show their cards.
I find it odd that the head seems fixed, not involved in any of the tasks. If the training was done using videos of humans I would expect the head to move a bit, looking down at the apple before picking it up, same for the dishes etc.
Did you actually research Archer Aviation also? You’d be more skeptical if you did.
He kinda sounds like the robot from Intergalactic…
the voice sounds like HERBERT
Try Altman.
Reminds me of when SpaceX first landed rockets; people were saying it was fake 😂
The weird contrast is strange, pls fix
Sounds like Douma to me
Give me a reason why Figure knew the paper was trash, that the basket was for trash, why it needed to angle the basket, and where the command to toss it back was. You don't see the questions because you're not looking. Nikola was a company-wide scam. It's always possible, considering the money.
Big fan of Figure, John and Scott! Robot sounds like Jobs to me. Dead guy voice makes sense. Alexander Scourby would be great for the bot. However this sort of contrived demo is misleading. I wonder what would happen if they simply rotated the drying rack 180 degrees. BTW, being the son and grandson of alcoholics, I'm uncomfortable with John's ZBiotic promotion. In fact, "hangover" symptoms tend to discourage abuse, and as such, assuming ZBiotic has the advertised prophylactic effect, it could possibly result in increased alcoholism. If you have data suggesting otherwise, that would be welcome.
This is really impressive. Here is my question: is this real, or has it been specifically engineered to look real? (I'm not saying it is not AI driven, just that the variables have been minimized, and training might have been specific to the demo.) My experience with other projects like self-driving vehicles, electric vehicles, manufacturing, and SpaceX is that it seems like everyone is ahead of Elon's company until they aren't. Then you realize they were never ahead. It will be interesting to see if the robot companies are actually ahead or not.
Can you give me your take on why the person in the scene is very stiff and not moving very much?
I don't expect Elon to be drawn into a world of demo wars. He will wait, wait, wait until (probably) he can show Optimus on the production line doing something really useful, without rehearsal, that CAN be commercially valued. I don't fully agree with Warren, but there is something off; I just can't articulate it effectively. I hope I am wrong: competition is very, very valuable.
I’m pretty sure it’s based off of Rob Lowe’s voice.
It’s Sam Altman voice
so it is closer ... REV 13 ... here to watch him. And he ordered the people of the world to make a great statue of the first Creature, who was fatally wounded and then came back to life. 15 He was permitted to give breath to this statue and even make it speak! Then the statue ordered that anyone refusing to worship it must die!
16 He required everyone-great and small, rich and poor, slave and free-to be tattooed with a certain mark on the right hand or on the forehead. 17 And no one could get a job or even buy in any store without the permit of that mark, which was either the name of the Creature or the code number of his name.
That code is a tesla code
@@benroberts8363maybe Amazon?😎
I thought that it sounded like Scott Walter...
Lots of people don’t like seeing it and don’t want to believe it’s real. Anything they don’t really understand is fake to them. 😂
Hahaha 😂
Yeah, it would be absolutely impossible to map out all the heuristics, speech, etc., and program all of that in a mere 13 days, even with a huge team of programmers. It's clear that machine learning had to be at play here…
Everyone says Boston Dynamics doesn't do this without having any idea of what's going on at Boston Dynamics. I don't get it.