AI will replace ALL of the tier 1 support people. And AI will study/learn from every call that has to be escalated up to a Tier 2 support (actual) person. As AI learns, fewer and fewer calls will need to be escalated, and then layoffs of Tier 2 people will begin.
I remember the term "Digital Divide" being used for those in developing nations with access to internet and those without. I guess this is the 2nd Digital Divide
Jerry could just be lying down next to the table. As both o1 and gemini say, the table's position could reasonably be considered a red herring or irrelevant information, especially since earlier in the story you strongly suggest that Jerry successfully places the cup on the table. You don't "place" items on sharply angled surfaces, you'd use a different verb like "hold the cup up to". I don't think your question is as simple or obvious as you think.
NotebookLM is incredible. I think I'm going to be switching from listening to podcasts to listening to self-generated podcasts based on my notes. Incredible. Truly incredible.
I interpret the word "few" to mean 3. If it was 2 it would be better to use "couple" and if it was more than 3 it's better to use "several". If we agree that few implies 3 thousand days, that's ~2033. It aligns decently well with my expectation, though I'd put that more as a median rather than "latest" Edit: corrected year calculation
I have gotten in heated, alcohol-fueled arguments with math-major friends over the meaning of "few", "several", and "couple". I was in the camp that saying "couple" means two, and is always less than "few" or "several". To my surprise, I was in the minority! Counting that kind of ambiguity, I think Philip's range of two to five is very reasonable, although the high end might be even higher.
Honestly I think people are looking too much into the meaning of a word which for all we know could've been an offhand use instead of a specific planned number to hint at anything. I doubt even Altman can properly guess what happens when we get AGI and the distance between that and ASI.
@djayjp thanks for the note/correction, I was going based off memory of that calculation from a day or two earlier. My math is usually accurate but my memory is usually wildly off so this adds up 😂
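For anyone wanting to check the "few thousand days" arithmetic in this thread, here is a minimal sketch. It assumes the remark dates from 23 September 2024 (an assumption, not something stated in these comments) and simply adds 2, 3 and 5 thousand days to that date. `ASSUMED_START` and `date_after_days` are illustrative names only.

```python
from datetime import date, timedelta

# Assumption: the "few thousand days" remark dates from late September 2024.
# Change this if you prefer a different starting point.
ASSUMED_START = date(2024, 9, 23)


def date_after_days(days: int, start: date = ASSUMED_START) -> date:
    """Return the calendar date that falls `days` days after `start`."""
    return start + timedelta(days=days)


if __name__ == "__main__":
    # "Couple", "few" and the upper end of the range discussed in the thread.
    for thousands in (2, 3, 5):
        target = date_after_days(thousands * 1000)
        print(f"{thousands} thousand days -> {target.isoformat()}")
```

Under that assumed start date, 3,000 days lands in late 2032, which is consistent with the "~2033" estimate above.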
Everything is moving faster and faster, the world is changing. These things DO have an impact. Let’s see how society, business, people will react and adapt. Hoping for the best.
I'm already using the NotebookLM tool, within a few minutes after watching this video. This is THE way I'm going to deliver complicated information to people I love.
I tend to agree with o1's comments about the tilted "table" question, being a "red herring" type of question. Strictly speaking the definition of a table is from its function (to be able to have something standing on it like plates, food etc.), hence the question is calling something a table that is not a table (because it is tilted), and so you can either ignore the incorrect table term or the tilted part, and o1 chooses to ignore the tilted part. A decent answer from o1, in my view.
I am concerned with the marketing strategy around AI seeming to keep promising more and better. That's a tale as old as time in terms of manipulation and being sucked into a scam. My worry is tempered by seeing actual tangible improvements in the product as I use it, but it still makes me a bit wary of everything, especially when the leaders like Sam are doing it so much. Call it an orange flag, it's got my defenses up and makes me more skeptical, which is probably a good thing to be for miracle promises. Also, unrelated, that ad insert was seamless, having it happen while my mind is primed to 'wait' for the model to load is actually so smart. It both let me be more willing to accept the ad since I 'had to wait anyways' and demonstrate/let me feel how long it takes more concretely. Neat
When I first read this, I imagined a very weird upright table with a super tall top surface spanning all the way from Jerry's ankles to his shoulders. Basically a giant slab of mahogany wood, detailed with many intricate carvings and supported by four stubby legs. I was imagining if Jerry could reach up to place the upside-down cup on top of that thing without the strawberry falling out in the process, and I kept thinking "you call that a normal table?!".
Also, Claude picks up on the table's design but concludes that the strawberry is still on the table: Explanation: The crucial point is when Jerry placed the cup upside down on the table. At this moment, the strawberry, being subject to Earth's gravity, would fall out of the cup onto the table's surface. All of Jerry's subsequent actions (lifting the cup, dropping other items, putting the cup in the microwave) do not affect the strawberry's position, as it's already on the table. The details about the table's design (ornate left top corner, intricately-carved bottom-right top surface) and its interaction with Jerry's body (nudging his shoulder, digging into his ankle) don't affect the strawberry's location. These details might be relevant for other aspects of the scenario, but they don't influence where the strawberry ends up.
Can you confirm if NordVPN works for you? I've tried using PIA but all their US based servers seem to be blocked by ChatGPT (it tells me to try disabling my VPN). I even tried setting up my own VPN using a DigitalOcean droplet based in their NYC3 data center but even that is blocked by the app.
Surprisingly in the live-bench gemini 1.5 pro-002 scored lower than the previous version in the reasoning category (46.00 vs. 49.33 previously). I can't wait to see how it will perform in simple-bench.
I'm so sorry, my ad blocker must have skipped that part with AVM access! Great video, full of information and with most recent information as always. Love your work, one of the youtube creators that when i see new video on my home page, i just have to watch it. Again, sorry for the trouble.
The weirdest thing is when I read your jerry strawberry piece, I also dismissed the shoulder and ankle details as "red herring bullshit". I was focused on the strawberry and ensuring it was left behind from the cup. Rather than carefully parse the exact words you said, I read "bottom right corner" and assumed you were referring to the bottom of a leg of the table.
We love Philip, our one stop shop for unbiased AI information! In a choppy sea of AI hype these days, our friend in the UK keeps a steady hand at the helm in his journalism
NotebookLM is pretty amazing. As soon as Meta releases the open-source multimodal model, I hope there will be an open-source version of this which can be integrated to other note-taking tools.
Been watching black mirror for the first time recently to get ready for the future, just watched S1E3 a week ago and the amount of parallels it runs with Meta's prototype Orion glasses is crazy, I highly recommend watching it. Also would be interested in your thoughts on the glasses even if it isn't AI related, it's going to be a big part of our future as it becomes cheaper and this channel is where I go to for my futuristic fix
I gave the 'Strawberry in a cup' challenge to a plain old Llama model, in three simple steps, and it understood perfectly that the strawberry is on the table. It said the problem is just the way it is explained. It also thinks the 'o1' model is an over-hyped bucket of crap that simply dissects prompts into atomic steps with output feedback, to aid in understanding.
Thank you ! Watching all of your videos with joy. I'm also quite excited about the podcast feature as it could be more fruitful for learning / memorizing things when listening to a natural conversation instead of getting raw input of a document simply read to.
For your benchmark question about the tilted table: when you ask me such a question, I do not rely on words, I imagine the situation in my mind's eye and run a "simulation". Until they can do the same, there's no way for LLMs to answer such questions.
One consideration is that compute to train the model is much different from compute to run inference. Inference takes orders of magnitude less compute, so it's likely the training costs will be aggregated into the costs of providing the answers (not forgetting the costs of researching ways to improve the model beyond just adding more data).
Just a few days ago I asked Gemini (not pro) a very simple question that involved rolling a few dice (regular six-sided dice). It offered a sample result of rolling 2, 3, 4 and even after asking it to check its math, it was sure that both 3 and 4 were equal to or greater than 4.
I'm really not convinced by the question you used. It has no obvious real world answer to it. The question says "..places the cup upside down on a normal table". There are 2 possibilities here for a table that is on its side. 1. The cup is upside down on the top surface of the sideways table i.e. the side of the table. Therefore, when the cup is lifted the strawberry remains on the table. 2. The cup is not upside down. The cup is on its side. Therefore the strawberry is sitting on the inside of the cup. Therefore, when the cup is lifted the strawberry remains on the inside edge of the cup. The question says he is standing, but the "bottom-right top surface digs into his outstretched right ankle". If the table is on its side it can only be against it with no force between the two. The table is described as "normal" so for me the only explanation is that Jerry is standing on one foot whilst lying on a normally positioned table with his shoulder against one corner and his right ankle outstretched so that the table digs into his ankle. The cup is upside down and the strawberry is on the table. Quite frankly that is an appalling way to treat an intricately carved mahogany table at tea time. And attempting to microwave a strawberry is a culinary sin. Slice it and put it on your scone ;-)
As others pointed out, your question is so convoluted that I had no idea and stopped listening halfway through. As a human (I hope), I would have failed your test, so to me it makes the GPT answer more human than you realize.
@aiexplained-official No worries. I love your work, and I understand the importance of a world model for an AI. And I have no doubt that developing Simple had to take a lot of time. So the stat I'm more curious about is human performance on Simple. I do realize the fear of pollution if you released it to public, so I support you releasing just a couple of questions to public to give a very general overview and then having a proper human bench with trusted humans that you know won't leak the questions. But we need to have a human baseline there.
Awesome video! The update on the power grid story was the most interesting to me, but honestly all of it was great. I'll check out the notebook LM thing with my thesis and see how that goes.
Notebook LM can do a lot more than just the podcast function. It is brilliant at summarizing documents or formatting tasks, no hallucinations, spends as long as needed to fully execute the prompt! Google is finally onto something!
Having spent 30 years as a software engineer I can attest to the deep deep deep aversion among ‘developers’ to naming conventions of any kind. After reverse engineering their corporate culture and factoring in the multiple variables we might predict the next release will be labeled ‘Gemini 1.5 Pro 003’ only to have them name it ‘Bubbles’. The thing they love most about standards is that there are so many to follow.
The red herring logic o1 used is something I also noticed when trying to prompt engineer 4o to solve such puzzles. I wonder if a custom instruction for o1 that tells it to take any information seriously would help here too.
Creating better lossless and smaller quantizations and optimizing inference engines might reduce significantly the compute resources required, they went overkill with the compute before even allocating budget to research how to optimize those.
I get what you were going for with the altered prompt. But your changes introduced too many unknown variables to predict with certainty that the strawberry would roll. Therefore it was reasonable for the model to determine it was a red herring. Jerry may be contorting himself in a way that indicates he and not the table are in an odd position. But even if you discount that, there's still the degree of the tilt that's uncertain. And since strawberries come in an almost infinite number of random shapes, this particular strawberry may very well remain where it is in spite of the table's angle - which again, we don't know the degree of. In this case, with all the uncertainties, I think it's a point for the model that it recognized the table would be at an angle, AND understood that lacking more data, it was prudent to assume it stayed where it was.
Two corners of the tabletop are touching a man that is standing upright, meaning 1 corner is located almost directly below the other. This means the table is rotated roughly 90 degrees, so if you were to place the cup "upside down" on the 90 degree slope, would it really be upside down? In order to truly place the cup upside down you would need to place it on the side of the table or on one of the legs. If placed on the side of the table, then if it does not fall over when Jerry drops it then the strawberry has a chance to remain on the table. If the cup was placed on the tabletop, then being a 90 degree slope, Jerry must hold onto the cup the entire time or it will drop, and the strawberry will rest on the side of the cup, not the table. In that case the strawberry likely ends up in the microwave.
Thank you for the hint with reinstalling the app! Amazing content as always - I’ve been following you since this whole boom started and am grateful for your high quality videos! 9$ is a great price/value ratio :-)
This is the best YouTube channel for trying to understand what is happening in AI. If you want to sample additional NotebookLM conversations/summaries I have published several on my channel.
I found all of them very interesting. The podcast feature is amazing!!!! I loved it. I just sent an episode about my paper to my family so they don't have to read it
I can’t believe they called it a “deep dive” conversation they took one of the most annoying phrases of the modern era besides “shocked the industry” or “number 5 will shock you” and made a meal out of it. I hate this world.
Yep, having free artificial intelligence instantly generate your documents into a conversation between two people and using a two-word expression you don't like is certainly justification to hate the world. I mean seriously what's the purpose of it all? I mean paying nothing, to have an artificial intelligence platform generate your documents to a professional sounding podcast is just nothing short of downright depressing if they're using some Expressions I personally dislike. Jeez.
I would have said the strawberry is in the cup because if the top left corner is in his shoulder and the bottom right is at his outstretched (I assumed in front of him) ankle, the table is not just tilted, it is partially upside down.
@aiexplained-official my guess is its world model is still the same tier as previous versions, but the multi-step reasoning allows it to transpose the elements through its steps and get it right. But there could be other models that actually integrate the LLM component with a world model that is not only text-based, which will be able to get these answers right without having to "reason" for these "simple" common sense questions. Btw I love your channel, and I will sub to Patreon as soon as I have a proper job.
A 200 milligram bumblebee can recognize faces, learn complex navigation *while flying*, communicate location and quantity of resources to peers, and demonstrate logical inference in novel situations. There is something very broken if it takes a nuclear power plant to approach the intelligence of an individual insect. Maybe more "jigawatts" will help LLMs, but that doesn't mean things aren't very broken.
Yes, we definitely are working harder rather than smarter. I feel it’s likely that the machines we’re building will do the actual legwork to make themselves compare to the extreme optimization of biological brains.
Well to be fair, a bumblebee had millions of years to fine tune its own biology to get there, we're not even 10 years in and yet we can get o1 to solve really hard problems in STEM subjects. Of course one of the reasons we can do that in such a short amount of time is because we have the energy, math, and data to do it for us, but we're not going to be able to decode nature's millions of years of design in a matter of a few years if we're going to stick with what we have now, hence why we need to improve what we can directly impact in the shortest amount of time to keep up with nature
Well for sure, it's only going to get optimized more and more down the line. One reason these models are so big, is that they memorize the entire internet. Karpathy said that it's very possible that we can get a really small model, which will be just as smart, but simply not have all of that knowledge built in, which it will be able to look up.
Me, I like character/world roleplay chat bots. I really want to get to the point where we can spend days doing sessions and the world doesn't fall apart. Like if I wanted to simulate a Harry Potter adventure at Hogwarts and most of it is day after day of me going to my classes over and over again, interacting with characters, making friends, finding out about their backstory as there is a mystery to look into, then random plot elements kick in like bad things happening on Halloween. And then I manage to play the text sim through to the climactic end of the school year.
I uploaded a boring log file for an online game and it somehow made a 13 min interesting podcast about the inner workings... It even found out errors with mods I had installed.
When he places the cup upside down, the strawberry, which was inside the cup, would naturally fall out onto the table (since the cup is upside down and he's still holding it). The description of the table serves to indicate that the table is at an angle relative to Jerry's body. The left top corner nudging his shoulder and the bottom-right top surface digging into his right ankle suggest that the table is tilted or even vertical. Therefore, when Jerry places the cup upside down on the table (which is at an angle or vertical), and he's holding the cup the entire time, the strawberry would fall out and end up on the floor. Next, Jerry lifts the cup (which he's been holding all along), drops anything he is holding aside from the cup (which doesn't include the strawberry because it already fell out), and places the cup in the microwave. Conclusion: The strawberry is now on the floor. --- Answer: The strawberry is on the floor; it fell out when he inverted the cup he was holding. Solved it for me. I would fail this one, does this mean o1 has a better world model than me😂
My prediction: AI video calls come 2026. The amount of compute required for video output is significantly higher than audio. Think of the file size of a video versus a song. Making it realtime will be very expensive.
The model doesn't need to generate the video, you can have a 3d model and it just generates the motion instructions to match the audio. That should be good enough
I want to point out that as a climber the first thing I imagined in the table story is the table at an upright position and the person using a heel hook on one corner and holding onto the table such that their shoulder is touching the other corner. It would not necessarily mean that the table is tilted sideways...
All interesting of course, desperately hoping we get advanced voice mode in the API soon though. There's so much I want to build. Looking forward to the Simple-Bench Results! Maybe there should be a human leaderboard too…
If you gave me your strawberry table question on an exam paper, I'd be one of the kids that write, "who the f*ck knows, the question tells us nothing about Jerry's orientation or location. Sure we're assuming normal earth physics, but maybe he's on the f*ck*ng ISS orbiting high above the earth"
Notebook LM has been outstanding for me. A genuine game changer, in terms of engagement on what could be a boring essay or sheet of data. And it will be off the scale when you can join in the conversation as it happens. One thing I'm curious about though, is how it determines the length of each deep dive.
I loaded our organization's transcribed video training curriculum (64-pages) into Notebook LM and it came back with a remarkably succinct and entertaining 18-minute podcast in about 10-minutes. This now serves as the intro to our curriculum. Stunning is an overused word in this arena but I was, uh, a bit stunned.
I had a similar experience uploading a draft of a stage play I'm writing. The podcast was astonishingly insightful and naturalistic.
I actually imagined Jerry being positioned very awkwardly rather than the table being tilted.
Yeah no offense but that was not a good way to describe the orientation of the table. Specifically the phrase "bottom-right top surface" is quite ambiguous.
Plot Twist: Philip has been OpenAI Advanced Voice this whole time
Welcome to the Death of Content-based Verification !
I was beginning to wonder whether _I_ was a model when I kept getting confused by what sounded like "One corner of the table is touching his shoulder, and the opposite corner is touching his _right angle."_
It is so unfair that your subscriber count does not grow faster. You are by far the best informational source on the internet about AI. But please know that it is so appreciated that you don't turn to super clickbait titles etc! When my finances are a bit more stable I will 100% be a part of your Patreon!
Thanks so much Danmark, for considering it!
vids are too long, it's a choice of a format, I wish he had like a 2 minute summary of the long format videos (which are still great, but you don't always want to spend the time to watch).
@dcx45 Too long? Nah, his videos are of optimal length. They are rare, though, and some news gets no mention.
Your description of the table had me thinking it was super thick rather than tilted.
In my mind the person was just doing some weird acrobatics next to a regular table.
I thought it had branches sticking out lol
I thought he was just bad at describing tables
"Ultra Gemini Double-O seven?" Best philip take in a while
Hiiii Daaaavvvveee 😊
Gave me a chuckle actually.
Chief AI officer.
I think your test was very confusingly worded and I suspect most humans would have been so confused as to not get the correct answer as well.
If you draw it out on a page, with him standing, it is virtually impossible to draw without it being heavily tilted, but yes, could have picked an easier example. Just wanted it like the one he gave.
@aiexplained-official Test this on two or three acquaintances of average intelligence and see their responses. I suspect they will struggle to parse the confusing setup as well.
@aiexplained-official how would he even place the cup on the table if it’s that tilted in the first place, much less reaching over to pick the cup up. It’s literally pressing against his shoulder, wouldn’t it be too heavy and motion restrictive to literally do anything? Like I’d definitely pick the wrong answer based on how confusing the wording is. If cup stays on, strawberry stays on, that’s how I thought about it
I think I will have to release a dozen public examples of Simple to show how even easier questions can fool it. After even more testers, human average still hovering close to 90%
@aiexplained-official How's it so low? Are they really trying, or given enough time?
Honestly, I thought of Jerry being curled around the table in a very weird way. I also didn't think of the strawberry falling down the sideways table
Yes, I couldn't imagine how Jerry is positioned and haven't realized that the table is tilted (and it shows that even as a human, a more complex sentence structure can be hard to understand).
Yeah same, because it's supposedly a normal table, that means Jerry is in a super weird pose..
He was standing, remember, so the table is resting against him, as if he is transporting it somewhere. If you draw it out on a page, with him standing, it is virtually impossible to draw without it being heavily tilted, but yes, could have picked an easier example. Just wanted an example that was very similar to the one OpenAI gave - not easy!
The answer seemed obvious in retrospect but stumped me. I think we’re actually very close to AGI and possibly already there if you consider most humans just ain’t that bright
@aiexplained-official But was he standing normally? As opposed to the normal table? :)
NotebookLM is wild. Not just super useful, but also fun. Try the podcast generator with some exotic content, like log files.
Hahahaha 😂 So Jerry.. let‘s hear about what happened then at minute 13 just after 4 o’clock…
Best AI news channel on YouTube. Literally how I keep up with the latest in the AI space. Everybody else is just clickbait with god awful surprised pikachu thumbnails and declarations that agi has arrived. Thanks for your high quality, rigorous, and honest reporting.
Yup. Other channels aren’t even close
I just recently unsubscribed everybody who uses those damn same clickbait titles. thx for mentioning
Only one worth even watching, this space is just saturated with clickbait overhyped trash.
Wow that notebookLM thing is gonna save me a lot of time when starting to study something, to get like an overview of what I'm gonna study, then studying is generally so much easier.
I didn't imagine the table tilted from that question, but Jerry being bent over, sort of leaning on the table and lifting up his foot to reach the corner of the table's top surface.
He was standing though, no?
@aiexplained-official I was also imagining this - sort of like a snooker player. TBH I found that section incomprehensible - 'bottom-right top surface' almost sounds like an oxymoron.
Same, my brain broke a bit while listening, not sure if it’s a neurodivergent thing. But it does show how even we can be easily confused.
@aiexplained-official personally I also thought you were trying to trick the AI by spilling in nonsense about the table's corners. I personally would have failed that question if it was provided to me, and I'm glad I'm not alone on that failure.
Yeah I thought that the table is flat and not tilted but has some weird decoration parts that stretch to the guy's ankles and shoulders.
NotebookLM is crazy. In less than 5 minutes I had a 2 person 10 minute podcast that was basically unrecognizable as AI describing 4 web pages and 2 pdfs. It is legitimately insane.
Wait. How does that work? I thought it was merely a knowledge summarizer.
@@tomikexboii5403 It is mainly a summarizer (though I think it can also draw on other knowledge that the LLM might have), and can add various data like sites and pdfs as well as notes you write. But besides querying it rag-style, it also generated a two-person podcast style conversation. Which is pretty much indistinguishable as being AI
Altman has hilarious phrases. "in the coming weeks", "a few thousand days"
I'm sorry Philip, I really didn't get your table example lol.
Fair enough! Wasn't the best
You failed the robot test. Please go to the next maintenance center and get an upgrade before posting to YouTube again.
Governments: "5 Power plants EACH. Totally, completely, absolutely mad! We'll nev.."
Tech Industry: "It powers a submissive god."
Governments "So when do we get started?"
Tech Industry: "It will revolutionise cat videos as we know it."
Governments "And how much money do you want?"
I'm glad you'll be refining that prompt based on feedback. But I wanted to point out that with all the uncertainties, I think it's a point for the model that it recognized the table would be at an angle, AND understood that lacking more data, it was prudent to assume it stayed where it was, considering many strawberries tend to have one or more semi-flat sides.
Bye bye call centers lol.
Depends, time and money will tell. People who call for support (usually elderly people) hate being handled by bots of any kind (will hang up and leave a bad review).
And call centers which do sales cold calls? Well, as soon as the receiver detects that he's talking to a bot (which is super easy right now. Advanced Voice is a lot better than previous solutions but the voice + tone + grammar is still very artificial.) he'll also hang up so it depends on the sale success rate whether companies will be interested long term.
Humans are easier to set up
I hated it lol wfh and office during cuck downs in 2021 never again good
AI will replace ALL of the tier 1 support people. And AI will study/learn from every call that has to be escalated up to a Tier 2 support (actual) person. As AI learns, fewer and fewer calls will need to be escalated, and then layoffs of Tier 2 people will begin.
@@stevechance150 very good take!
Never forget Sky
Don't worry, someone will clone it and you'll pirate it
A moment of silence, please
Gone but not forgotten
i want her back
What’s Sky?
I remember the term "Digital Divide" being used for those in developing nations with access to internet and those without. I guess this is the 2nd Digital Divide
Between? Those nations who can and can't come up with their own ai?
@@straylight7116 You have an air of superiority I do not approve
@@straylight7116 Between people who can afford a subscription and those who cannot afford to eat consistently daily. How does that sound to you?
@@rpeart73 why you agitated?
@@mimameta haha I was just asking chill. You read me wrong.
Jerry could just be lying down next to the table. As both o1 and gemini say, the table's position could reasonably be considered a red herring or irrelevant information, especially since earlier in the story you strongly suggest that Jerry successfully places the cup on the table. You don't "place" items on sharply angled surfaces, you'd use a different verb like "hold the cup up to". I don't think your question is as simple or obvious as you think.
@@PrestonCole-j3i He is standing, no?
I follow tens of YT uploaders. Only two of them produce videos that make me rush to my PC and watch without any delay. You are one of them
When I first read your "tilted table problem", I thought the top surface of the table was 5 feet thick, not tilted.
It does say normal table in fairness
@@aiexplained-official are normal tables tilted? I suppose very few are. The fact it understood it was tilted and dismissed it was interesting.
Can't a normal table be leaned against something, and thereby be tilted?
@@WilliamLeeSims Is it not interesting, that the answers are so good, that we are now arguing, whether it was wrong or not or particularly clever?
NotebookLM is incredible. I think I'm going to be switching from listening to podcasts to listening to self-generated podcasts based on my notes. Incredible. Truly incredible.
I interpret the word "few" to mean 3. If it was 2 it would be better to use "couple" and if it was more than 3 it's better to use "several". If we agree that few implies 3 thousand days, that's ~2033. It aligns decently well with my expectation, though I'd put that more as a median rather than "latest"
Edit: corrected year calculation
"few" does not mean 3.
"few" has the same meaning as "several". That is a small, undefined number, greater than 2.
Yeah 4 at the most. Btw your math is off there. 3000 / 365 = 8.22 years or ~2033.
I have gotten in heated, alcohol-fueled arguments with math-major friends over the meaning of "few", "several", and "couple". I was in the camp that saying "couple" means two, and is always less than "few" or "several". To my surprise, I was in the minority! Counting that kind of ambiguity, I think Philip's range of two to five is very reasonable, although the high end might be even higher.
Honestly I think people are looking too much into the meaning of a word which for all we know could've been an off hand use instead of a specific planned number to hint at anything. I doubt even altman can properly guess what happens when we get agi and the distance between that to asi.
@@djayjp thanks for the note/correction, I was going based off memory of that calculation from a day or two earlier. My math is usually accurate but my memory is usually wildly off so this adds up 😂
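For anyone who wants to re-run the "few thousand days" arithmetic from this thread, here is a minimal sketch. The start date is an assumption (23 September 2024, roughly when the remark was published), and the helper names are made up for illustration; swap in whatever start date and day count you prefer.

```python
from datetime import date, timedelta

# Assumption: we count from 23 September 2024, roughly when the
# "few thousand days" remark was published. Change this if you disagree.
ASSUMED_START = date(2024, 9, 23)


def days_to_years(days: int, days_per_year: float = 365.0) -> float:
    """Convert a day count to years using a flat days-per-year figure."""
    return days / days_per_year


def date_after_days(days: int, start: date = ASSUMED_START) -> date:
    """Return the calendar date that falls `days` days after `start`."""
    return start + timedelta(days=days)


if __name__ == "__main__":
    # "few" read as 2, 3 or 5 thousand, matching the range discussed above.
    for thousands in (2, 3, 5):
        days = thousands * 1000
        years = round(days_to_years(days), 2)
        print(f"{days} days = {years} years -> {date_after_days(days).isoformat()}")
```

With 3,000 days this gives the 8.22 years quoted above. Under the assumed start date the end point lands in December 2032, which is consistent with the "~2033" estimate.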
Everything is moving faster and faster, the world is changing. These things DO have an impact. Let’s see how society, business, people will react and adapt. Hoping for the best.
I thought he was lying on the table. Somebody lying on the table seems more common than a table being placed on its side.
it was stated that he is standing in a normal position
I'm already using the NotebookLM tool, within a few minutes after watching this video. This is THE way I'm going to deliver complicated information to people I love.
It's amazing
The real news imo is that Flash handily beats the former Pro model. Amazing.
I tend to agree with o1's comments about the tilted "table" question being a "red herring" type of question. Strictly speaking, a table is defined by its function (being able to have something standing on it, like plates, food etc.), hence the question is calling something that is not a table (because it is tilted) a table, and so you can either ignore the incorrect table term or the tilted part, and o1 chooses to ignore the tilted part. A decent answer from o1, in my view.
that beginning light effect that slowly reveals the rest of the page while you did the "opener" was pretty neat :)
The cat at the end of the video jumping with three posterior legs is hilarious. For AI, power is never enough.
I am concerned with the marketing strategy around AI, which seems to keep promising more and better. That's a tale as old as time in terms of manipulation and being sucked into a scam. My worry is tempered by seeing actual tangible improvements in the product as I use it, but it still makes me a bit wary of everything, especially when the leaders like Sam are doing it so much. Call it an orange flag, it's got my defenses up and makes me more skeptical, which is probably a good thing to be for miracle promises.
Also, unrelated, that ad insert was seamless, having it happen while my mind is primed to 'wait' for the model to load is actually so smart. It both let me be more willing to accept the ad since I 'had to wait anyways' and demonstrate/let me feel how long it takes more concretely. Neat
Nice
When I first read this, I imagined a very weird upright table with a super tall top surface spanning all the way from Jerry's ankles to his shoulders. Basically a giant slab of mahogany wood, detailed with many intricate carvings and supported by four stubby legs. I was imagining if Jerry could reach up to place the upside-down cup on top of that thing without the strawberry falling out in the process, and I kept thinking "you call that a normal table?!".
I've been using advanced voice to do echoing practice in mandarin. It's so good. Absolutely game changing stuff.
I've been addicted to Notebook LM for the past few days 😅
Me too. I was very pleasantly surprised. Especially since it is a Google product. The only AI service from them I use.
Also, Claude picks up on the table's design but concludes that the strawberry is still on the table: Explanation:
The crucial point is when Jerry placed the cup upside down on the table. At this moment, the strawberry, being subject to Earth's gravity, would fall out of the cup onto the table's surface. All of Jerry's subsequent actions (lifting the cup, dropping other items, putting the cup in the microwave) do not affect the strawberry's position, as it's already on the table.
The details about the table's design (ornate left top corner, intricately-carved bottom-right top surface) and its interaction with Jerry's body (nudging his shoulder, digging into his ankle) don't affect the strawberry's location. These details might be relevant for other aspects of the scenario, but they don't influence where the strawberry ends up.
The idea of a VPN on my phone to get access to advanced voice never really occurred to me.
NordVPN to the rescue! Thanks Philip! *laughs in german*
I had ze same idea, fellow citizen! 😊
In 2024, Europeans have joined the Chinese and they also need a VPN to circumvent the great firewall of stupid bureaucrats
Can you confirm if NordVPN works for you? I've tried using PIA but all their US based servers seem to be blocked by ChatGPT (it tells me to try disabling my VPN). I even tried setting up my own VPN using a DigitalOcean droplet based in their NYC3 data center but even that is blocked by the app.
Best ai news resource by a country mile, thanks for all your effort in making this excellent content
Thanks so much gerredy
Surprisingly in the live-bench gemini 1.5 pro-002 scored lower than the previous version in the reasoning category (46.00 vs. 49.33 previously). I can't wait to see how it will perform in simple-bench.
I'm so sorry, my ad blocker must have skipped that part with AVM access! Great video, full of information and with most recent information as always. Love your work, one of the youtube creators that when i see new video on my home page, i just have to watch it. Again, sorry for the trouble.
The weirdest thing is when I read your jerry strawberry piece, I also dismissed the shoulder and ankle details as "red herring bullshit". I was focused on the strawberry and ensuring it was left behind from the cup. Rather than carefully parse the exact words you said, I read "bottom right corner" and assumed you were referring to the bottom of a leg of the table.
Fair enough
The Gemini podcast thing is dope
We love Philip, our one stop shop for unbiased AI information! In a choppy sea of AI hype these days, our friend in the UK keeps a steady hand at the helm in his journalism
Thank you doobie
Thanks for calling out the cliffhanger model numbering - everyone is doing it. Feels like a mix of brinkmanship and coming soon hype.
NotebookLM is pretty amazing. As soon as Meta releases the open-source multimodal model, I hope there will be an open-source version of this which can be integrated to other note-taking tools.
Been watching black mirror for the first time recently to get ready for the future, just watched S1E3 a week ago and the amount of parallels it runs with Meta's prototype Orion glasses is crazy, I highly recommend watching it.
Also would be interested in your thoughts on the glasses even if it isn't AI related, it's going to be a big part of our future as it becomes cheaper and this channel is where I go to for my futuristic fix
Thank you for all your content so far!
Thanks derzer!
NotebookLM has absolutely stunned me. What a staggering achievement.
Me too
I gave the 'Strawberry in a cup' challenge to a plain old Llama model, in three simple steps, and it understood perfectly that the strawberry is on the table. It said the problem is just the way it is explained. It also thinks the 'o1' model is an over-hyped bucket of crap that simply dissects prompts into atomic steps with output feedback, to aid in understanding.
6:49 love those double negatives 😂
Thank you ! Watching all of your videos with joy. I'm also quite excited about the podcast feature as it could be more fruitful for learning / memorizing things when listening to a natural conversation instead of getting raw input of a document simply read to.
Thanks so much Lorenz
This money's for your latest vid because it's not enabled on that one. I thought I followed the news you covered but you caught a LOT I missed.
Thanks so much Joe, that's so kind to donate anyway
For your benchmark question about the tilted table. When you ask me such a question, I do not rely on words, I imagine the situation in my mind's eye and run a "simulation". Until then, there's no way for LLMs to answer such questions.
Thanks for the heads up! Love the new voices! Ask it to detail new features and they will describe lots of great info!
One consideration is that compute to train the model is much different from compute to run inference. Inference takes orders of magnitude less compute, so it's likely the training costs will be aggregated into the costs of providing the answers (not forgetting the costs of researching ways to improve the model beyond just adding more data).
Just a few days ago I asked Gemini (not pro) a very simple question that involved rolling a few dice (regular six sided die). It offered a sample result of rolling 2, 3, 4 and even after asking it to check its math, it was sure that both 3 and 4 were equal to or greater than 4.
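The check the model got wrong in that dice example is easy to spell out. A minimal sketch, assuming the task was simply "which of the rolls are 4 or higher" (the function name is made up for illustration):

```python
def meets_threshold(rolls, threshold):
    """Return one boolean per roll: True if the roll is >= threshold."""
    return [roll >= threshold for roll in rolls]


if __name__ == "__main__":
    sample_rolls = [2, 3, 4]  # the sample result quoted in the comment
    print(meets_threshold(sample_rolls, 4))  # [False, False, True]
```

Only the 4 passes, so the claim that both 3 and 4 are "equal to or greater than 4" fails on the 3.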
Amazing video as always, you deserve the success
Thanks Julius!
Good storytelling. The dramatic tension was real.
NotebookLM is awesome. Been using it in class. Unreal!
I'm really not convinced by the question you used. It has no obvious real world answer to it. The question says "..places the cup upside down on a normal table". There are 2 possibilities here for a table that is on its side.
1. The cup is upside down on the top surface of the sideways table i.e. the side of the table. Therefore, when the cup is lifted the strawberry remains on the table.
2. The cup is not upside down. The cup is on its side. Therefore the strawberry is sitting on the inside of the cup. Therefore, when the cup is lifted the strawberry remains on the inside edge of the cup.
The question says he is standing, but the "bottom-right top surface digs into his outstretched right ankle". If the table is on its side it can only be against it with no force between the two.
The table is described as "normal" so for me the only explanation is that Jerry is standing on one foot whilst lying on a normally positioned table with his shoulder against one corner and his right ankle outstretched so that the table digs into his ankle. The cup is upside down and the strawberry is on the table. Quite frankly that is an appalling way to treat an intricately carved mahogany table at tea time. And attempting to microwave a strawberry is a culinary sin. Slice it and put it on your scone ;-)
As others pointed out, your question is so convoluted that I had no idea and stopped listening halfway through. As a human (I hope), I would have failed your test, so to me it makes the gpt answer more human than you realize.
My bad
@@aiexplained-official No worries. I love your work, and I understand the importance of a world model for an AI. And I have no doubt that developing simple had to take a lot of time. So the stat I'm more curious about is human performance on Simple. I do realize the fear of pollution if you released it to public, so I support you releasing just a couple of questions to public to give a very general overview and then having a proper human bench with trusted humans that you know won't leak the questions. But we need to have a human baseline there.
Yeah I will do a small public set
Awesome video! The update on the power grid story was the most interesting to me, but honestly all of it was great. I'll check out the notebook LM thing with my thesis and see how that goes.
Thanks absta
Thanks! Great content, as always. 🙏🏼
This video opens the way I see after reading too many Twitter/Reddit posts on AI.
Notebook LM can be a definite game changer for anyone who wishes to learn without study.
Notebook LM can do a lot more than just the podcast function. It is brilliant at summarizing documents or formatting tasks, no hallucinations, spends as long as needed to fully execute the prompt! Google is finally onto something!
The strawberry is Schrödinger’s Cat - you took a snapshot in time and didn’t account for motion 😂
Having spent 30 years as a software engineer I can attest to the deep deep deep aversion among ‘developers’ to naming conventions of any kind. After reverse engineering their corporate culture and factoring in the multiple variables we might predict the next release will be labeled ‘Gemini 1.5 Pro 003’ only to have them name it ‘Bubbles’. The thing they love most about standards is that there are so many to follow.
Notebooklm is impressive. I have started generating "podcasts" of subjects that I want to get an overview over.
The red herring logic o1 used is something I also noticed when trying to prompt engineer 4o to solve such puzzles. I wonder if a custom instruction for o1 that tells it to take any information seriously would help here too
Creating better lossless and smaller quantizations and optimizing inference engines might reduce significantly the compute resources required, they went overkill with the compute before even allocating budget to research how to optimize those.
I get what you were going for with the altered prompt. But your changes introduced too many unknown variables to predict with certainty that the strawberry would roll. Therefore it was reasonable for the model to determine it was a red herring.
Jerry may be contorting himself in a way that indicates he and not the table are in an odd position. But even if you discount that, there's still the degree of the tilt that's uncertain.
And since strawberries come in an almost infinite number of random shapes, this particular strawberry may very well remain where it is in spite of the table's angle - which again, we don't know the degree of.
In this case, with all the uncertainties, I think it's a point for the model that it recognized the table would be at an angle, AND understood that lacking more data, it was prudent to assume it stayed where it was.
Notebooklm looks awesome, can’t wait to try.
Two corners of the tabletop are touching a man that is standing upright, meaning 1 corner is located almost directly below the other.
This means the table is rotated roughly 90 degrees, so if you were to place the cup "upside down" on the 90 degree slope, would it really be upside down?
In order to truly place the cup upside down you would need to place it on the side of the table or on one of the legs.
If placed on the side of the table, then if it does not fall over when Jerry drops it then the strawberry has a chance to remain on the table.
If the cup was placed on the tabletop, then being a 90 degree slope, Jerry must hold onto the cup the entire time or it will drop, and the strawberry will rest on the side of the cup, not the table.
In that case the strawberry likely ends up in the microwave.
TBH bro, I don't even get the table thing though. Need a picture here hahahaha.
Thanks, quality every video, follower since 2022 ish ! 100% 🏆🏆🏆
Thank you for the hint with reinstalling the app! Amazing content as always - I’ve been following you since this whole boom started and am grateful for your high quality videos! 9$ is a great price/value ratio :-)
This is the best YouTube channel for trying to understand what is happening in AI. If you want to sample additional NotebookLM conversations/summaries I have published several on my channel.
I found all of them very interesting. The podcast feature is amazing!!!! I loved it. I just sent an episode about my paper to my family so they don't have to read it
I can’t believe they called it a “deep dive” conversation they took one of the most annoying phrases of the modern era besides “shocked the industry” or “number 5 will shock you” and made a meal out of it. I hate this world.
Yep, having free artificial intelligence instantly generate your documents into a conversation between two people and using a two-word expression you don't like is certainly justification to hate the world. I mean seriously what's the purpose of it all? I mean paying nothing, to have an artificial intelligence platform generate your documents to a professional sounding podcast is just nothing short of downright depressing if they're using some Expressions I personally dislike. Jeez.
I would have said the strawberry is in the cup because if the top left corner is in his shoulder and the bottom right is at his outstretched (I assumed in front of him) ankle, the table is not just tilted, it is partially upside down.
@@aiexplained-official my guess is its world model is still the same tier as previous versions, but the multi-step reasoning allows it to transpose the elements through its steps and get it right
But there could be other models that actually integrate the llm component with a world model that is not only text based that will be able to get these answers right without having to "reason" for these "simple" common sense questions
Btw I love your channel, and I will sub to patreon as soon as I have a proper job
Notebook LM took me by surprise... it is awesome.
A 200 milligram bumblebee can recognize faces, learn complex navigation *while flying*, communicate location and quantity of resources to peers, and demonstrate logical inference in novel situations. There is something very broken if it takes a nuclear power plant to approach the intelligence of an individual insect. Maybe more "jigawatts" will help LLMs, but that doesn't mean things aren't very broken.
A bumble bee can’t solve calculus the way o1 can
Yes, we definitely are working harder rather than smarter. I feel it’s likely that the machines we’re building will do the actual legwork to make themselves compare to the extreme optimization of biological brains.
Well to be fair, a bumblebee had millions of years to fine tune its own biology to get there, we're not even 10 years in and yet we can get o1 to solve really hard problems in STEM subjects. Of course one of the reasons we can do that in such a short amount of time is because we have the energy, math, and data to do it for us, but we're not going to be able to decode nature's millions of years of design in a matter of a few years if we're going to stick with what we have now, hence why we need to improve what we can directly impact in the shortest amount of time to keep up with nature
Well for sure, it's only going to get optimized more and more down the line. One reason these models are so big, is that they memorize the entire internet. Karpathy said that it's very possible that we can get a really small model, which will be just as smart, but simply not have all of that knowledge built in, which it will be able to look up.
now think about how long it took for the bumblebee to evolve to that point and how much energy it took
NotebookLM is just amazing.
That large context is just great for chatting with long documents.
Me, I like character/world roleplay chat bots. I really want to get to the point where we can spend days doing sessions and the world doesn't fall apart. Like if I wanted to simulate a Harry Potter adventure at Hogwarts, and most of it is day after day of me going to my classes over and over again, interacting with characters, making friends, finding out about their backstory as there is a mystery to look into, then random plot elements kick in like bad things happening on Halloween. And then I manage to play the text sim through to the climactic end of the school year.
What web highlighting tool do you use to mark the text in your videos
Hypothesis
I uploaded a boring log file for an online game and it somehow made a 13 min interesting podcast about the inner workings... It even found out errors with mods I had installed.
When he places the cup upside down, the strawberry, which was inside the cup, would naturally fall out onto the table (since the cup is upside down and he's still holding it).
The description of the table serves to indicate that the table is at an angle relative to Jerry's body. The left top corner nudging his shoulder and the bottom-right top surface digging into his right ankle suggest that the table is tilted or even vertical.
Therefore, when Jerry places the cup upside down on the table (which is at an angle or vertical), and he's holding the cup the entire time, the strawberry would fall out and end up on the floor.
Next, Jerry lifts the cup (which he's been holding all along), drops anything he is holding aside from the cup (which doesn't include the strawberry because it already fell out), and places the cup in the microwave.
Conclusion: The strawberry is now on the floor.
---
Answer: The strawberry is on the floor - it fell out when he inverted the cup he was holding. Solved it for me. I would fail this one, does this mean o1 has a better world model than me? 😂
My prediction:
AI video calls come 2026.
The amount of compute required for video output is significantly higher than audio. Think of the file size of a video versus a song.
Making it realtime will be very expensive.
I remember when people said AI video was decades out at the beginning of this year...
The model doesn't need to generate the video, you can have a 3d model and it just generates the motion instructions to match the audio. That should be good enough
@@DJ-dh3oe That is what Meta is already doing.
Altman will promise anything to lure in more investors 😂
Great video as always
It's always a good day when an AI Explained video drops!😍😍🥰
I want to point out that as a climber the first thing I imagined in the table story is the table at an upright position and the person using a heel hook on one corner and holding onto the table such that their shoulder is touching the other corner. It would not necessarily mean that the table is tilted sideways...
Following this example the conclusion would be that the strawberry would remain UNMOVED on the table.
Thanks, this was a great informative video. The one story I found potentially useful was the thing about the notebook thing.
It is amazing
Releases it 4 months late. "It's out early!" Man, corporate delusion is pretty amazing.
All interesting of course, desperately hoping we get advanced voice mode in the API soon though. There's so much I want to build.
Looking forward to the Simple-Bench Results! Maybe there should be a human leaderboard too…
If you gave me your strawberry table question on an exam paper, I'd be one of the kids that write, "who the f*ck knows, the question tells us nothing about Jerry's orientation or location. Sure we're assuming normal earth physics, but maybe he's on the f*ck*ng ISS orbiting high above the earth"
Haha
Notebook LM has been outstanding for me. A genuine game changer, in terms of engagement on what could be a boring essay or sheet of data. And it will be off the scale when you can join in the conversation as it happens
One thing I'm curious about though, is how it determines the length of each deep dive.
I did see someone from Google make a comment that they are planning to give more control over the podcast output, so that is encouraging.