AI Doom | Google's STUNNING Videogame Generation Model BREAKS the Videogame Industry...

Wes Roth

Views: 158,326

Published on 15 Sep 2024
The latest AI News. Learn about LLMs, Gen AI and get ready for the rollout of AGI. Wes Roth covers the latest happenings in the world of OpenAI, Google, Anthropic, NVIDIA and Open Source AI.
My Links 🔗
➡️ Subscribe: / @wesroth
➡️ Twitter: x.com/WesRothM...
➡️ AI Newsletter: natural20.beeh...
#ai #openai #llm
Diffusion Models Are Real-Time Game Engines
gamengen.githu...
DIFFUSION MODELS ARE REAL-TIME GAME ENGINES
arxiv.org/pdf/...
John Carmack's AGI Effort
keenagi.com/

Comments • 741

@indycinema 16 days ago ⁺¹¹⁹
I like your coverage. Literally HATE your clickbait titles. Nothing is broken or a game changer every week. It's just a demo.
@Fermion. 14 days ago ⁺¹⁶
Agreed. Instant thumbs down and unsubscribe (if applicable) and I turn on ad-block when they do this clickbait foolishness.
We're all adults here. No need for childish titles.
@turolretar 11 days ago ⁺⁴
The clickbait is why I clicked, so I see nothing wrong with it
@dfreshMC 10 days ago ⁺²
I'm seeing this more and more: I might love a channel and love their videos, but I swear to God I cannot unsubscribe fast enough if they do that clickbait crap. I can't stand it. You know how many videos say "oh my God, scientists find asteroid has life, guaranteed announcement to come soon", and then you watch the video and it's nothing about that? Nothing. Oh, they should be able to be put in jail.
@Crittek 8 days ago ⁺¹
Bad bot
@morrisd9 6 days ago
Well it’s a complex problem. If you want views you have to play the algorithm game.
@johannesdolch 17 days ago ⁺¹⁰⁷
That's not the kind of AI Doom scenario I expected.
@OscarTheStrategist 16 days ago ⁺⁷
Superb comment 🤘🏼
@seudonak 16 days ago ⁺⁴
Haha!
@brawth 11 days ago ⁺¹
@@OscarTheStrategist Read, as always, in Shao Kahn's voice
@BudEightySeven 7 days ago
Nice lol Cheers
@ItsJustAdrean 6 days ago
It's a real doom scenario. For all humanity. Taking all of our jobs and replacing human connection
@Ben_D. 17 days ago ⁺¹⁴¹
I like how DeepMind doesn't promise anything. They just publish stuff.
'Hey! Look! We can now fold proteins. We can now create new synthetic materials. We can now do videogames.'
@pvanukoff 16 days ago ⁺¹¹
Can, could, might, may ... yep, that's all I've heard from the AI field in general recently. Wake me up when these things are actually available to the general public, until then, it's all vaporware.
@gjb1million 16 days ago ⁺⁷
💯
Demis Hassabis > Sam Altman?
@moderncontemplative 16 days ago ⁺¹
Diffusion models are a true "black box" problem! Excellent breakdown but I don’t think music in the background is necessary IMHO. I love the channel though.
@delphicdescant 16 days ago ⁺⁸
@@pvanukoff Oh it's not all vaporware. It's just that the good stuff is secured deep within the vaults of the hyper-wealthy. It's real, it's just not for you.
@snowcones8292 16 days ago ⁺¹
@@pvanukoff you're so extremely dense bro
@mrgoober6320 16 days ago ⁺¹⁰
An infinite ad hoc FPS is just entertainment, but an infinite ad hoc RPG is an early version of the Matrix.
@johnnyjohnjohn4216 days ago
The best analogy we have at the moment is 'we live in a simulation'.
Code creates the mountains, the rivers, the trees, the birds in the wind...the wind itself.
In this simulation, our code is consciousness.
There are innumerable 'games' being played at the same time, in the same space. We call these dimensions.
Consciousness can and does transcend space, time, and dimensions.
Edit: and it is our unconscious/subconscious aspects that are reflected to us
"In the province of the mind what one believes to be true, either is true or becomes true within certain limits. These limits are to be found experimentally and experientially. When so found these limits turn out to be further beliefs to be transcended." John C Lilly
"The universe is an engine of narrative" Terrence McKenna
"This is my real country! I belong here. This is the land I have been looking for all my life, though I never knew it till now... Come further up, come further in!" CS Lewis
@ThexBorg 17 days ago ⁺³⁴
This will mean that most of the in-game architecture and objects don't need to be built. They only need text-based descriptions. Any actions on those objects can be saved as text. The AI interprets it all. The NPCs will be the same. Games are going to change in a big way.
@absolstoryoffiction6615 16 days ago ⁺³
I only have one question... How much Memory Data do you think these games will require?
Algorithm Generation on a dynamic level into the game requires a lot of memory over time unless the player is able to delete those files/data in-game. Without deleting the Save File.
Astroneer and many Sand Box games have a cap on how much data the game can handle before crashing.
@FloatingOer 16 days ago ⁺²
@@absolstoryoffiction6615 Probably not that much data for the actual gameplay, anything created during the game would probably only need to be saved as text files, it doesn't need to track separate objects since it's just a picture. It's like an LLM for video games, though the generator would then be far too large for anyone to store locally so I'd assume online play only if you want real time generation. There are AI models that can generate images already so this is "just" asking it to "predict the next frame", but with player input and fast enough frame generation.
@somechrisguy 16 days ago ⁺⁷
We could basically give it a novel or script from a movie/series and have it create a game universe in that world
@1flash3571 15 days ago ⁺²
@@absolstoryoffiction6615 It is pretty big. Tesla has already been doing this type of thing, training the FSD for its fleet. They can simulate the roads and their conditions using all the data they have, and generate scenarios and situations on the road.
@TheExileFox 15 days ago ⁺²
in a weird sense, The Sims actually did something slightly similar 20 years ago. The key is that in The Sims, it's not the characters that look for objects, it's the objects that advertise their descriptions to the characters and this is for performance reasons.
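A minimal sketch of the pattern @TheExileFox describes, in Python: objects advertise what they offer, and a character only scores those advertisements against its own needs. The class names, need names and numbers are made up for illustration and are not taken from The Sims itself.

```python
from dataclasses import dataclass


@dataclass
class SmartObject:
    """A world object that advertises which needs it can satisfy."""
    name: str
    adverts: dict  # need name -> how much of that need the object restores


@dataclass
class Character:
    """A character that knows its own needs but nothing about objects."""
    name: str
    needs: dict  # need name -> urgency, 0.0 (satisfied) to 1.0 (desperate)

    def choose(self, objects):
        """Pick the object whose advertisement best matches current needs."""
        def score(obj):
            return sum(self.needs.get(need, 0.0) * value
                       for need, value in obj.adverts.items())
        return max(objects, key=score)


# Hypothetical objects and weights, chosen only for the example.
fridge = SmartObject("fridge", {"hunger": 0.8})
bed = SmartObject("bed", {"energy": 0.9})
tv = SmartObject("tv", {"fun": 0.6, "energy": 0.1})

sim = Character("sim", {"hunger": 0.2, "energy": 0.9, "fun": 0.4})
chosen = sim.choose([fridge, bed, tv])
print(chosen.name)
```

The character code never needs to know what a fridge or a bed is, so adding a new object type means writing a new advertisement rather than touching the character logic.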
@ChrisSherlock 17 days ago ⁺⁶²
Commander Keen!
@ErichReich 16 days ago ⁺⁸
We’re old 😂
@bubblesculptor 12 days ago ⁺¹
That's back when I programmed in GWBASIC
@bruceleroy6551 9 days ago
On 3.5 lol
@justinanderson267 7 days ago ⁺¹
I remember playing that game with this weird joystick where you could screw a little peg in the middle of the D-Pad.
I lost that piece almost immediately xD
@sez1742 4 days ago ⁺¹
Lol yes! 486….shareware….Doom!
@retrofuturism 17 days ago ⁺¹⁰
Simulating different pi worlds:
We could create virtual environments where pi has different values, allowing us to observe and study the effects.
Scientific breakthroughs:
New perspectives on the anthropic principle and fine-tuning in cosmology
Insights into the role of fundamental constants in complex systems
Potential discovery of hidden mathematical relationships
@TheGoodContent37 15 days ago ⁺³⁰
Imagine getting born just to exist inside a DOOM game for all eternity as a research subject
@potat0-c7q 14 days ago ⁺¹¹
I don't have to imagine it, I was born to just respond to youtube comments
@benayers8622 11 days ago ⁺³
@@potat0-c7q the future is here 😞
@Wolfsheim23 9 days ago ⁺¹
Imagine an AI singularity being born into a fully realized virtual reality? Then who's to say it hasn't already happened? If we can do that, then it's probably already happened to us. Like in the plot of The 13th Floor. 2nd best sci fi concept since The Matrix IMHO. We become god at that point. I'm not even sure if True AI is possible because it leads. I don't think we will see it in our lifetime if it is even possible.
@Paraselene_Tao 6 days ago
If life is but an AI-genned dream or the imagination of GaWd or a nightmare of a butterfly or whatever else, then I accept my position in the dream or imagination or nightmare or whatever. I will laugh and cheerfully go on about my absurd life, like nothing ever mattered. 😂
@Wolfsheim23 6 days ago
@@Paraselene_Tao You will gladly eat the steak knowing you're trapped in a VR prison.. as long as you're rich! Ha
@ArunNairOne 16 days ago ⁺⁹
Think of the process like sculpting from a block of stone. The random noise is like the raw stone block, and the model learns how to chisel away the noise (the excess stone) to reveal the final image (the sculpture). By learning how to remove noise in this controlled manner, the model can generate complex images that look like they were naturally created, even though they started from nothing but noise.
@diliupg 10 days ago
Noise has everything, It selects what is required for a given frame.
@Dina_tankar_mina_ord 17 days ago ⁺⁴⁹
Imagine playing a game with Flux or Midjourney type detailed graphics. When the graphics shift from geometry-based polygons to highly detailed 2D images that appear 3D, complete with fake volumetric particles, ray tracing, and super smooth angles in every frame. Achieving that level of detail at 60 fps is a long journey ahead. But once it's achieved, the graphics will be more beautiful than real life.
@absolstoryoffiction6615 16 days ago ⁺²
It will be Gumball, with different animation styles and art styles being dynamic instead of static to the player.
@N1h1L3 16 days ago ⁺¹¹
Add some extra years for the VR versions. What a time to be alive!
@eSKAone- 16 days ago ⁺⁶
Maybe the simulation we live in is more beautiful than real life
@eSKAone- 16 days ago
VR adult movies generating on the fly depending on your arousal state 🤤
@MadsterV 16 days ago
we'll soon be playing with electric sheep
@Exhithronous-y1n 16 days ago ⁺⁸
An AI never ending map of GTA would be great.
@BlakeEM 17 days ago ⁺¹⁹
I believe that image diffusion models work by stretching vectors and tweaking pixel color values in latent space based on the training data, weights, conditioning (prompt, ControlNet, etc.), CLIP (how it identifies objects/text in the image), and the current noise. If you tell it to make a cat, it will start with noise or an image with noise added. It will stretch image vectors and adjust the pixel RGB values to more closely match the cats in the training data that had the most similar noise pattern with the most similar prompt/terms being used at that specific noise level. This will usually be a blend between multiple images in the training data that have a similar noise profile; this is why it makes new cats. It starts with the most commonly seen part, usually a face. It will then add a very specific calculated amount of noise each step, and subtract it based on the CFG scale (more CFG will denoise faster, in effect following the prompt more by setting more groundwork early on) and the denoise amount.
It's much more complicated, but this is how I've come to understand it over the last couple of years.
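For the CFG part of the comment above, a minimal Python sketch of how classifier-free guidance is usually written may help. In the standard formulation the scale does not change how fast noise is removed. It sets how far each step's noise estimate is pushed from the prompt-free prediction toward the prompt-conditioned one. The function name and the three-element lists are illustrative assumptions; a real sampler applies this to full latent tensors.

```python
def guided_noise(eps_uncond, eps_cond, cfg_scale):
    """Classifier-free guidance: blend two noise predictions.

    eps_uncond: noise predicted without the prompt
    eps_cond:   noise predicted with the prompt
    cfg_scale:  0 ignores the prompt, 1 uses the conditioned prediction
                as-is, larger values extrapolate past it
    """
    return [u + cfg_scale * (c - u) for u, c in zip(eps_uncond, eps_cond)]


# Made-up predictions for three latent values (illustration only).
eps_uncond = [0.10, -0.20, 0.30]
eps_cond = [0.30, -0.10, 0.10]

print(guided_noise(eps_uncond, eps_cond, 0.0))  # prompt ignored
print(guided_noise(eps_uncond, eps_cond, 1.0))  # plain conditioned prediction
print(guided_noise(eps_uncond, eps_cond, 7.5))  # pushed well past it
```

This is why a high CFG scale feels like "following the prompt more": the difference between the two predictions is amplified at every step.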
@absolstoryoffiction6615 16 days ago
It's an interesting technology but not market viable yet.
Imagine an 8 Ball which can auto generate images on the fly. That's novelty.
But selling pictures that were generated?... ... ... I'd rather use Google Image Search for the same product but better & free.
@dj007twk 16 days ago ⁺²
so more token prediction but with image vectors?
@BlakeEM 16 days ago ⁺²
@@dj007twk Yeah, basically. Similar to an LLM, just with vectors and RGB values as output predicted off the noise/conditioning input rather than text input. For video, they add a temporal prediction component to predict the change between frames, given the previous frames. This requires another model, such as AnimateDiff. I think they basically trained their own AnimateDiff model on Doom with added parameters for keyboard input. This isn't doing anything new. This is why it's able to keep track of ammo accurately, it's temporally consistent, but not as good with less predictable values that it has less training on.
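The "predict the next frame from previous frames plus keyboard input" loop discussed in this thread can be sketched in a few lines of Python. This is a toy under heavy assumptions: `predict_next_frame` is a hand-written stand-in for the trained model, a frame is a small dict rather than pixels, and `CONTEXT_FRAMES` is an arbitrary number, not the value used in the paper.

```python
from collections import deque

CONTEXT_FRAMES = 4  # hypothetical context length, chosen for illustration


def predict_next_frame(past_frames, past_actions):
    """Stand-in for the trained model (an assumption, not the real network).

    It sees only the recent frames and actions and returns the next frame.
    """
    frame = dict(past_frames[-1])
    action = past_actions[-1]
    if action == "fire" and frame["ammo"] > 0:
        frame["ammo"] -= 1
    elif action == "left":
        frame["angle"] = (frame["angle"] - 15) % 360
    elif action == "right":
        frame["angle"] = (frame["angle"] + 15) % 360
    return frame


def play(first_frame, actions):
    """Autoregressive loop: each output frame is fed back in as input."""
    frames = deque([first_frame], maxlen=CONTEXT_FRAMES)
    recent_actions = deque(maxlen=CONTEXT_FRAMES)
    for action in actions:
        recent_actions.append(action)
        frame = predict_next_frame(list(frames), list(recent_actions))
        frames.append(frame)  # anything older than the window is dropped
    return frames[-1]


final = play({"ammo": 50, "angle": 0},
             ["fire", "right", "fire", "left", "fire"])
print(final)
```

In this toy the ammo count survives because it is carried inside every frame, the way the HUD is drawn on screen. Anything that only existed in frames older than the window is gone, because the loop keeps no separate game state.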
@luxxart 16 days ago
Thanks for sharing your understanding
@absolstoryoffiction6615 15 days ago
@@BlakeEM
True
@lyznav9439 17 days ago ⁺³³
This is not a video game generation model. It's a video game clip generation model.
@Corbald 16 days ago ⁺⁸
That's just a matter of semantics and a control system, really. It's only a clip because nobody is controlling it. I think it was NVIDIA who relatively recently showed that you could 'hint' a similar engine, though designed to simulate driving, with controlling inputs via a 'plainclothes' prop. In other words, the prop it's drawing its images _over_ (think image-to-image) steers its wheels to the right and the AI full on hallucinates an appropriate right turn, replete with terrain and road-markers moving across the field of view. It's really just about keying in the concept of turning, shooting, etc... with some sort of indicator or placeholder, so the engine has something to work with.
Look, I'm not trying to put too fine a point on it, but I'm in my 40s. I've _never_ seen any technology, including internet and cell phones, advance at such a rapid pace. AI is actualizing innovations which _should,_ by all rights, be _decades_ away, but in mere _months._ Furthermore, the naysayers who claim the bubble will pop haven't actually done any real research as to what's being experimented with in academic circles. Reading the papers, there's a real claim to be made that we crossed Singularity with GPT3. Not full on AGI or ASI, but that point of inability to predict the future. We have, in labs, all the pieces to produce an ASI. Full stop. It's just a matter of integration hell, now.
@eSKAone- 16 days ago ⁺³
But it's interactive
@jswew12 16 days ago ⁺⁴
@@eSKAone- how is it interactive? From what I am seeing, it seems to be a system that can convincingly choose next frame outputs based off of current context and learned patterns, but it does this in raw chunks. You couldn’t just take over for the AI and do any particular action, because it doesn’t actually understand what that action is, it just knows what the next frame should look like. I think in theory there might be a way to train a model to understand how any particular user input would change game state and then output this as the new world, but that doesn’t seem to be what they are doing here. Disclaimer: I am going off of this video and did not read the paper, so if you have more info on this lmk.
@MadsterV 16 days ago ⁺⁵
@@jswew12 "GameNGen can interactively simulate the classic game DOOM at over 20 frames per second on a single TPU"
@MadsterV 16 days ago ⁺³
it's also flat out said in the video at 0:36 to 0:40
@KaiPhox 16 days ago ⁺⁹
Wolfenstein 3D was the first 1st person shooter, released in 1992 2:11
@brycesstuff 11 days ago ⁺¹⁰
There were loads of 3D games for PC before Doom and Wolfenstein. For instance, 1983 Star Wars, or 1989 MechWarrior. In fact id Software made a game in 1991 that was the predecessor to Wolfenstein 3D, called Catacomb 3-D. One of my favorites was the D&D game by SSI called Eye of the Beholder. I think it might have been the first 3D PC game that I played. People mistake a lot and say that Doom was the first 3D game; Doom was the first popular 3D game, and Wolfenstein 3D was the runner-up to that. There weren't enough people into PCs in the '80s for the early 3D titles to shine. Some of the best PC games were made from the late '80s to the mid '90s. Hands down.
@ThomasMeliWellness 16 days ago ⁺⁶
6:15 - Intuitive Diffusion Explanation - I think diffusion models will start to make sense to you once you introduce the idea of blending as a step-by-step process, in which the model learns the patterns in each step and associates them with words in the prompt ("iPhone"). Let’s revisit the blending example with this in mind. In the first step of blending, the components of the iPhone might still be mostly intact, just slightly damaged or bent. Now just go backwards: it’s relatively straightforward to imagine reconstructing the original iPhone from this slightly altered state. In the next step, the phone might be cut a bit; again, you could think about just gluing the parts back together. With each successive step, the damage increases, but simultaneously, the model learns how to reconstruct from a slightly more degraded state back to a more complete form. Each iteration teaches the model both a pattern of "deconstruction" and a pattern of "reconstruction."
I think the main roadblock you were having had to do with trying to imagine how you could reconstruct a full phone from its fully blended components, but this mental model doesn't include within it the learning that occurs at each step to go from fully functioning phone to blended pulp of a phone and back again. If you include that "intelligence" in the process and look at each step as learning representations to go back and forth between the more blended and less blended phone iterations, you can imagine how you could take a phone pulp and reconstruct a phone out of it (at least visually, not materially).
Think of the "noise" or initial state like a set of scattered puzzle pieces that are random enough and numerous enough to form any image we might want if we modify them slightly. We can dip the puzzle pieces in paint to change their color or cut their edges a little more if we really want to. The prompt acts as a guide for assembling, cutting, and coloring these pieces. When the model is effectively trained, it learns how to use the prompt to modify and organize the pieces together, step-by-step, to form a coherent picture. It learned this from going backwards thousands of times and learning "in general" what to do each step.
How was that for an intuitive explanation? I tried to keep all the math out and just give the concept. Does it make more sense how it is possible and how it works?
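The step-by-step "blending" described above matches the forward noising process used when training diffusion models, and it can be sketched in plain Python. This follows the usual DDPM-style update, where each step keeps most of the previous image and mixes in a little Gaussian noise. The constant schedule, the step count and the four-value "image" are assumptions chosen only to keep the example small.

```python
import math
import random


def forward_noise(x0, betas, rng):
    """Return every intermediate state from the clean image to near-noise."""
    states = [list(x0)]
    x = list(x0)
    for beta in betas:
        # Keep sqrt(1 - beta) of the signal, add sqrt(beta) of fresh noise.
        x = [math.sqrt(1.0 - beta) * v + math.sqrt(beta) * rng.gauss(0.0, 1.0)
             for v in x]
        states.append(x)
    return states


def signal_fraction(betas):
    """Share of the original image still present after the given steps."""
    alpha_bar = 1.0
    for beta in betas:
        alpha_bar *= 1.0 - beta
    return math.sqrt(alpha_bar)


rng = random.Random(0)
betas = [0.02] * 500           # hypothetical constant noise schedule
image = [1.0, -1.0, 0.5, 0.0]  # four "pixels" standing in for a picture
states = forward_noise(image, betas, rng)

print(signal_fraction(betas[:1]))  # one step: the image is almost intact
print(signal_fraction(betas))      # all steps: almost no signal is left
```

Each adjacent pair in `states` is one "slightly more blended" example of the kind the comment describes. Training teaches the network to undo a single such step at a time, never the whole jump from pulp to phone.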
@KyleCBowman 14 days ago
@Wes Roth - good explanation here
@apdurden 17 days ago ⁺³²
This is actually a crazy development. I didn't think we'd get this so soon. Doesn't stop at video games. Think about the simple things like applications. If a model learns how to just generate UX/UI without code, AND personalized...then programming probably really is cooked
@TheReferrer72 17 days ago ⁺²
Don't be silly, a UX is only the interface to the underlying logic.
LLMs are much better at writing code than this technique, whose output is hard to verify as correct; hence they are trying it on games.
@absolstoryoffiction6615 17 days ago
@TheReferrer72
In other words... a work in progress and not yet market viable.
Good for devs, but a bad game is still a bad game.
It certainly cuts costs, but I would still hire someone who really knows C Sharp or C++, etc. Or who knows the game engine. Just in case the game breaks.
@TheReferrer72 16 days ago ⁺¹
@@absolstoryoffiction6615 No way. Devs' jobs are secure for at least a couple of years; these tools may enhance their toolset, not take jobs.
@absolstoryoffiction6615 16 days ago
@TheReferrer72
Correct... Because having experienced employees long term (5+ years) is not common in the gaming industry. Rare but unlikely.
@apdurden 16 days ago ⁺²
@TheReferrer72 No, the AI is the UI AND Logic. Wes doesn't mention it but in Matt Berman's vid he points out that the AI does visually keep up with stats like the amount of ammo. Maybe not 100% accurate but this is just the beginning. LLMs will create "world models" of applications, generate your UI and translate your usage/intent direct to compute and storage
@NomadDad 17 days ago ⁺¹⁸
Commander Keen! ⛑️
@ChrisSherlock 17 days ago ⁺²
You beat me to it..
@NomadDad 17 days ago ⁺⁴
@@ChrisSherlock good times, I loved those games
@nyranstanton203 6 days ago ⁺¹
John Carmack should get some kind of evolutionary genius make a huge difference award or something. I STILL play DOOM, kids are STILL playing doom, either todays doom or yesterdays doom...lol.
@blengi 16 days ago ⁺⁷
I used to paint a bit and have no technical AI chops, but my naïve intuition is diffusion models act a bit like artistic imagination to focus ideas onto images. Similar to looking at a bunch of dots or clouds then conceiving there might be a face in there somewhere, abstract thought processes bias the brain's expectation to actually imagine faces in the noise or clouds. Seems to me the denoising training is just a way to train associations in the latent space of concepts to bias the generative process post training. ie prompting a diffusion model is just priming the salient connections in the latent space concepts prompts allude to, to preferentially manifest the related image representations...
@seudonak 16 days ago
I have thought about this a lot previously, and I think you're exactly right.
@MadsterV 16 days ago
it's the closest to robot dreams. People think it's just colorful wording, but it really is the closest layman description.
@NA18NA 17 days ago ⁺⁴
If you watch 1000 videos of a stickperson being drawn, guaranteed you will learn exactly what it takes to draw a stick person, even if they are in different poses or multiple types of them, now imagine the same thing for everything else you see. This is why diffusion models work
@milesgrooms7343 16 days ago
That wouldn’t seem to work for something like playing an instrument. You could “look” like someone who is actually playing an instrument, but you wouldn’t without actually playing and “learning” the instrument.
Unless, you were able to simulate lessons and “playing/learning” an instrument. (Perhaps) that’s what it is doing(?).
Is this the general idea?
@NA18NA 16 days ago
@@milesgrooms7343 if you think of it from the perspective of simulating the output, you’ll have the idea. Music generation models “listen” to the sounds being played for them in the same way image models “look” at images. They then try to simulate the sounds based on the prompt they receive. Their goal isn’t to actually play the instrument but to generate a sound that sounds like the output of someone playing the instrument/singing/rapping etc.
@milesgrooms7343 16 days ago ⁺¹
@@NA18NA makes sense, I follow. I have not used or played around with any of the generative AIs.
Amazing to me to hear/see what is possible….wondered how it’s doing it though.
@JC-jz6rx 17 days ago ⁺²²⁷
Ah yes. Continue attempting to replace things people actually enjoy. Still waiting for an AI to replace something I don’t want to do myself.
Edit: based on the likes it seems the majority of people understand what I mean. Based on the replies a small subset don’t.
Yes AI is great. For now it makes life easier. But as humans we lack the forethought to know where it ends. It will only keep moving forward. While right now it’s a “tool that allows non-programmers to do video games etc.” it won’t always be that. At some point it will replace many jobs. When that happens people will be forced into trade jobs that will be oversaturated from immigration.
I work professionally, with AI as a software engineer. My current company is firing dozens of people replacing them with stuff I’m working on. This is happening. Get over the “rainbows and unicorns AI can draw for me” phase and see that the longer this exponentially grows. The more likely unintended and unforeseen advances will negatively impact the majority of people. Right now even though all those in the comments still see it as “oh but it helps so many people” just remember for every Billy who’s happy he can generate woody from toy story with two hats, there’s also a father who was replaced by AI who now has to find work in a garbage economy to provide for his children.
I’m genuinely convinced the same people mocking my opinion as an “old person afraid of change” are also the same type of people working a McDonald’s job where AI won’t affect them anyways.
@asyyfjd 16 days ago ⁺²⁴
Like earning money? 😂
@theguyonyoutube4826 16 days ago ⁺³⁰
It's a tool, instead of waiting for other people to do it, do it yourself
@mAny_oThERSs 16 days ago ⁺²⁵
AI didn't replace making games, it improved the process of making games and provided a future alternative for non-programmers to make games, which was previously completely impossible. Awesome right? It helps game developers with 3D generation and rendering, there are a bunch of new tools that improve game performance and visuals, while making the game development process faster and all that WHILE also creating the possibility for non-programmers to make games? Amazing right. Alright putting the sarcasm aside, actually name one thing in this video that shows AI replacing something people enjoy. Btw the best video games will always be made with the assistance of AI, not by AI.
@mc9723 16 days ago ⁺⁵
Explain to us what you think is being replaced?
@joshs.6155 16 days ago ⁺¹⁴
Ah yes. Let me just spend years learning software engineering, get funding, and a team together to then spend years to make a game that only I really want to play. Also, I want a specific niche movie that probably wouldn't make money because it's a weirdly specific niche. So let me spend years writing a script, find out how to make hundreds of millions of dollars to fund it because no one else will and I want specific actors, a crew, and all the other things that go into movie making, then spend years filming it. Super easy! Or I can write a couple prompts and now I have what I have always wanted. AI might completely change these fields and there are negative things but the average person will be able to create things they've always wanted but would never have the means to do. If you like making a game from scratch you still can just like you can now without using a premade game engine.
@silasgreene2479 6 days ago
It's honestly kind of poetic that the first game this technology is being used on is Doom. Doom and Quake are pretty much responsible for, and inspired, most of the games we play today.
@BThunder30 13 days ago ⁺¹
Commander Keen is a series of video games developed by id Software, starting in 1990. The main character, Billy Blaze, also known as Commander Keen, is an eight-year-old genius who travels through space and battles enemies using his raygun and pogo stick. The first set of episodes, "Invasion of the Vorticons," was released for MS-DOS in 1990. A standalone game called "Keen Dreams" was developed as a prototype for new systems and ideas for future games. It was completed in less than a month while simultaneously working on another game. The series has since gained popularity, with over 80,000 owners of the Keen Dreams release and 200,000 owners of the Commander Keen Complete Pack on Steam.
@AmazingArends 5 days ago ⁺¹
Some people say taking human brain cells and forcing them to play doom or do other tasks is a form of slavery because you're taking cells from a sentient being and forcing them to do something that they didn't volunteer to do.
@victorfsaaa 6 days ago
Ok, as a neuroscientist, I think that what happens is similar to biological nets:
If you tell the computer to noise the picture a million times and then you take the "noised" pics to teach it how to reconstruct the pictures, the neurons will learn each little step of denoising. You may not see the picture intuitively, but WE ACTUALLY DO THIS. Want to see? Look at a toddler learning how to walk: he has a lot of problems, his arms and legs do not respond adequately at all, they have too much noise and the brain needs to learn how to make the connections more stable and precise. We repeat this thousands, maybe millions of times through life. I just don't know if the non-biological networks can "regenerate" while working or if they need a stage of learning like us (we do the learning in our sleep; it is physical reconstruction of the brain)
@LastWordSword 16 days ago ⁺¹
Premises:
1. The Loab is a product of several signals dropping off into a common, latent, noisy space, but this space is the "outer bullseye" of 100% Pure Noise, more like 80%. The weak signals produce a messy gestalt.
2. This works from the image end, and produces more noise, 100% pure noise, and then asks for a particular image, cuing an iterative process that *can* produce the image requested.
3. Requesting the same still image continuously causes something like a "reversion to the mean", where the noise gradually creeps in, as no *new*, stronger signal is being generated. Only producing a NEW image renews the signal strength.
4. As these images were formed from noise, they're inherently resistant to not only image noise, but are more intensely integrated into temporal sequences. Halt the temporal sequence, the noise starts to creep in. The momentum is everything. You've got yourself a hyperdimensional flywheel.
@itzhexen0 16 days ago ⁺⁴
Maybe whatever traumatized the AI agents in that building is what Ilya saw.
@marasmusine 16 days ago ⁺²
This is fascinating. At present it doesn't seem to have a "memory" of the game state: you go through a door, shoot an enemy to leave a corpse, go away and come back to the door, the corpse is gone and the zombie is back? (video clip from 13:04 onwards)
@marasmusine 16 days ago ⁺¹
I'm surprised we don't yet have an AI making WADS/levels, that's what I'd like to see (and I don't mean procedural like SLIGE, I mean deep learning)
@WMRhapsodies 16 days ago ⁺²
About intuitively understanding Diffusion models, here are a couple of ideas, the second one more general and metaphoric, but way more relatable also:
-1 A Glorified (Intelligent) Denoising Algorithm: a traditional denoising algorithm, knowing nothing about what kind of picture/subject is trying to rebuild, may work by deducing the values of random pixels by other pixels around them. Instead, the Diffusion algorithm knows what kind of picture is trying to rebuild. It will accommodate the random pixel values to a pattern that matches the prompted picture/subject (in which has been previously thoroughly trained). If you can imagine this process successfully removing the noise from a picture with, say, just a 10% of noise, in order to get a perfectly clear picture (much more efficiently than the "dumb" traditional denoising algorithm), then you must remember that the model has also been trained with labelled pairs of pictures for the previous, intermediate, steps for the prompted subject, like from 20% noise to the 10%, 30% to 20…up to the 100% (I made up the exact values, but you get the idea).
-2 An Acute (Artificial) Case of Pareidolia: which is that thing that happens to us humans when we “see” something very familiar on a random configuration, like to see animal shapes in clouds, or JesusChrist’s face on a toasted slice of bread. The Diffusion algorithm has been trained on lots of patterns (they’r imprinted in the weights of the neural network kind of like the patterns of faces or familiar animals are imprinted in our brain) and by prompting we’r asking the IA to extract those very well known patterns from a totally random configuration.
-
I’m a graphic designer who knows nothing about mathematics or programming (plus, I’m not a native English speaker, in case anything sounds funny). These ideas are made up from watching some technical demonstrations and my own experience working with Stable Diffusion, but I think they may be kind of right since they’ve helped me to understand the limitations and nuances of generative art. If @Wes Roth himself or anyone tech-savvy can elaborate on them (or correct/criticize them) it would be great. Cheers.
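For anyone who wants to poke at the "10% noise, 20% noise... up to 100%" idea from point 1, here is a minimal Python sketch. It is only an illustration under loud assumptions: the function name `add_noise`, the five-pixel "image" and the simple linear blend with uniform noise are all made up for this example, and real diffusion models use Gaussian noise and a learned network rather than anything this simple.

```python
import random

def add_noise(pixels, noise_level, rng):
    """Blend each pixel with uniform random noise.

    noise_level = 0.0 returns the clean image, 1.0 returns pure noise.
    (Hypothetical helper; real models use a Gaussian noise schedule.)
    """
    return [(1.0 - noise_level) * p + noise_level * rng.random() for p in pixels]

rng = random.Random(0)

# A made-up five-pixel grayscale "image" with values in [0, 1].
clean = [0.0, 0.25, 0.5, 0.75, 1.0]

# The same image at increasing noise levels, like the 10%, 20%, 30% ... 100% steps.
levels = [0.1, 0.2, 0.3, 1.0]
noisy_versions = {level: add_noise(clean, level, rng) for level in levels}

# Training examples pair a noisier version (input) with a cleaner one (target).
training_pairs = [
    (noisy_versions[0.2], noisy_versions[0.1]),
    (noisy_versions[0.3], noisy_versions[0.2]),
]

for level in levels:
    print(level, [round(v, 2) for v in noisy_versions[level]])
```

At level 1.0 nothing of the original picture is left, which is the "totally random configuration" the generator starts from.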
@seudonak 16 วันที่ผ่านมา ⁺¹
Yeah, I agree. Pareidolia is a sort of decompression, or hallucination of what you want to see in randomness, and is definitely connected to generative A.I. It's crazy how learning about generative A.I. has so many insights into how the human mind works, and possible implications about reality itself.
@ThePedromosca 5 วันที่ผ่านมา
I'm 50 years old now. I first played this game around 1995 on a Sinclair ZX Spectrum 128k in Portugal, and I still have it today!
The game was amazing at the time. Hours and hours of playing... I remember it as if it were yesterday!
@DrEhrfurchtgebietend 16 วันที่ผ่านมา ⁺²
I would be very surprised to find out that the doom that was generated actually represented a consistent world with real physical laws. Although the randomness of it might add something to the game. But for example, the enemies might have somewhat random hit points
@DecentGradient 16 วันที่ผ่านมา ⁺²
Diffusion models start with a noisy image because it's a random blank slate. You could train it to start with a white background or something, but you get better results with noise, similar to having the weights and biases random in the model when you start training. I think of the slow diffusion as similar to chain of thought reasoning. It produces a better image if it slowly steps towards it from the random beginning rather than taking a giant leap. It has to slowly push each pixel to the right RGB values.
That's how I think of them anyway.
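A toy Python sketch of that "many small steps instead of one giant leap" picture, with heavy caveats: `target` is a hypothetical stand-in for what a trained network would steer towards (a real model never sees the answer, it predicts the noise to remove from its learned weights), and `denoise_step` is an invented helper, not part of any library.

```python
import random

def denoise_step(current, target, fraction):
    """Move each value a fraction of the remaining distance toward the target."""
    return [c + fraction * (t - c) for c, t in zip(current, target)]

rng = random.Random(0)

# Hypothetical stand-in for the image a trained network would steer towards.
target = [0.9, 0.1, 0.5, 0.3]

# Start from pure random noise, the "random blank slate".
image = [rng.random() for _ in target]

# Take many small steps rather than one big jump.
for step in range(20):
    image = denoise_step(image, target, 0.25)

max_error = max(abs(i - t) for i, t in zip(image, target))
print(max_error)
```

Each step closes a quarter of the remaining gap, so after 20 steps the values sit very close to the target without any single step having to get everything right.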
@ScottPowersArt 16 วันที่ผ่านมา ⁺²
Imagine reincarnating and you're in Doom on an internet server somewhere unknown and your only purpose is to wander around endlessly to train AI.
@lordvishnu8172 17 ชั่วโมงที่ผ่านมา
The most impressive thing is Doomguy's hand doesn't appear to contain extra digits. THAT is frightening progress.
@Bigohno0 5 วันที่ผ่านมา
Negative reinforcement = punishment for an incorrect action
Positive reinforcement = praise for a correct action
The noise is a reference point. You have the canvas within which to generate the image, some width by some height, it has two extremes; completely empty/blank where no pixel is filled in and full noise where every single pixel is filled in randomly. In between those two are a whole range of possibilities with pixels filled in with specific values. Filling specific pixels with the right values can give you an image of anything. Scanning over and over again different images of a given type (say cat) and looking at every level of noise for each image, you can map out the rules/patterns for how to alter individual pixels in order to produce that type of image….
@terrylandess6072 5 ชั่วโมงที่ผ่านมา
Pretty sure every tech creator doesn't want an AI reverse engineering their property and have it recreated by AI on modern software/hardware as a moving snapshot of the original. Copy isn't learning, but then living in the world of sequels and reboots - what else should we expect.
@PeteSimon 14 วันที่ผ่านมา
Just to get this out there - "positive reinforcement" is when you add something nice after the person being reinforced does something desirable. "Here's your chocolate, Penny."
"Negative reinforcement" is when you remove something bad after the subject does something desirable. Like that annoying beeping before you buckle your seatbelt; it stops (is removed) when you do the desired thing, buckling your seatbelt. The term for poking someone with a stick when they do something undesirable is "punishment," not "negative reinforcement."
Almost everyone gets this wrong. 🤪
@megatherion4406 2 วันที่ผ่านมา
-What is my purpose?
-You play Doom.
-Oh my god.
@johnjay6370 16 วันที่ผ่านมา ⁺¹
This is AMAZING and this is TODAY!!! Next year it will be 10x better easily!!! great video!!!
@cat...i_think 17 วันที่ผ่านมา ⁺⁴
It's a great day when you see a Wes Roth video :))
@jthompson7175 5 วันที่ผ่านมา
What's wild is I could see how someone training an AI model to play Doom could probably collect a large number of demo files storing playthroughs and use those to teach the model. This kind of took a different turn than I expected.
@tristanchildress2844 4 วันที่ผ่านมา
Gamers: Could you just update Battlefield Bad Company 2 to modern graphics
Industry: WE'RE SPENDING EVERYTHING ON AI GAMES THAT YOU CAN'T EVEN IMAGINE
@ambrose13 17 วันที่ผ่านมา ⁺¹⁰
Wolfenstein 3D, released one year prior, directly led to the development of Doom
@andrasbiro3007 16 วันที่ผ่านมา ⁺⁴
Wasn't it the same guy?
@delphicdescant 16 วันที่ผ่านมา ⁺³
@@andrasbiro3007 Yep lol. And Doom is a better game in every way, so I don't know why this was even brought up.
@andrasbiro3007 15 วันที่ผ่านมา ⁺¹
@@delphicdescant
Of course Doom is better, it's the next iteration of the game engine.
@delphicdescant 15 วันที่ผ่านมา
@@andrasbiro3007 Right, that's what I'm saying.
@this_isntmyname 5 วันที่ผ่านมา ⁺¹
This type of research is how the UAC was born
@NaanFungibull 8 วันที่ผ่านมา
Their model was trained on the actual game play footage. No pre-existing game, no ai simulation. We have a ways to go.
@SixtoLuna_art 10 วันที่ผ่านมา ⁺¹
Diffusion model is essentially reverse entropy, which if it’s possible in code would be fascinating to explore how this relates to physical reality
@rmt3589 7 วันที่ผ่านมา ⁺¹
7:00 It's more like taking apart the iPhone, piece by piece. In the AI's case, pixel by pixel. If you do that a few thousand times, you'd be able to build an iPhone from scratch parts. Likewise, after taking apart enough images pixel by pixel, it can rebuild them from pixels.
@AVATARdemon113 4 วันที่ผ่านมา
When entire games can be created from a simple prompt, with extra prompts to tweak the gameplay, design and setting, the industry will die, but the true metaverse / oasis from Ready Player One will be born.
Imagine:
Create a Witcher 3 style game set in a 1:1 scale Middle Earth where you play as a young Aragorn learning to become a legendary warrior and experienced Ranger.
Or
Create a Gears of War style World War 2 game set in the Battle of Stalingrad with both a German and a Russian campaign. The narratives are true to historical records. Make it epic, horrific and tragic, with smooth gameplay and ultra realistic graphics.
@johnsmitht11 7 วันที่ผ่านมา
The most interesting part of this is how the AI naturally learned to avoid something it should do because it's punished so heavily for it.
@markthompson1520 5 วันที่ผ่านมา
0:42 The most underrated statement of the entire video. You're basically hijacking someone else's dreams at that point. That's kinda cool tbh
@captainjpz 5 วันที่ผ่านมา
So... It's like the Matrix?
@bxl2012 17 วันที่ผ่านมา ⁺³
I usually don't click your videos anymore because I do not want to reward clickbait titles. I am glad I made an exception here. I am sure you are using tons of AI tools to help write the script and for the video footage. But the production value of this video - apart from the interesting content - is still insane! Well done.
@papackar 16 วันที่ผ่านมา
Imagine looking at a rock surface and seeing a vague, nonexistent bison in it, due to the natural color and shape of the rock, which are noisy. You then remove a bit of the noise, by scratching or rubbing with charcoal. Now you see a slightly more specific bison. Repeat the process.
@shiBuyaking109 10 วันที่ผ่านมา
The Fact that video games are being used to better AI systems is what’s bizarre because I thought games are bad for us! 🤷‍♂️
@TheBigBlueMarble 15 วันที่ผ่านมา
Denoise is common in photography applications. How it works is pretty simple to describe, in very high level terms...The original image was not random. It had a very specific pattern that humans (and the AI) can recognize as a dog or a cat or whatever. Then you add random noise to it until we can no longer recognize the original pattern. That does not mean the pattern is gone. It is simply hidden in the randomness of the noise. However, the pattern is not completely gone. It is only hidden from our limited human ability to recognize. Not so for an AI. It simply removes the random parts of the image and what is left is the original pattern. Yes, there is quite a bit of image degradation, but it works.
@MikeyDavis 15 วันที่ผ่านมา
Somehow, the way AI works in theory is exactly as I expected it to work, but I can’t understand the math behind it.
The reason AI looks like magic, is because it is accessing the same realm that magic comes from, the space between space and thing.
Anything that accesses that space, no matter how it does it, will produce magic, as the brain is a natural machine designed to convert things from that space into tangible reality expressed through the 5 senses.
Water accesses this very real space as it moves from one state to the next. This is why Veda Austin is able to get water to generate images simply by thinking about it.
You access that space right before you fall asleep, and this is why you are able to generate dreams.
If you can get sand to access this space somehow, you’ll see sand doing things that the brain can interpret as reality.
In my mind, It’s literally so intuitively just the way that latent space works.
@dusandragovic09srb 16 วันที่ผ่านมา
Computer = Mind = Nature ...
This will make people see more clearly.
@Jonathan-rm6kt 14 วันที่ผ่านมา
As impressive as this is, it's still totally intuitive to anyone who has sunk hours (days) into these games. You develop an inner sense of the rules of the game. You know exactly how long an Imp's fireball will take to reach you, and can look at a room full of a hundred spawns of demons and instantly assess the relative threat level and form an attack plan that gives you the best chance of survival. If we accept that generative models can learn anything that the average person can through repetition, it makes sense that it can not only learn to play the game, but to *become the game*
@Otherlevel51 17 วันที่ผ่านมา ⁺²
Eventually AI will be able to generate classic games with today's graphics and functionality
@absolstoryoffiction6615 16 วันที่ผ่านมา ⁺¹
It's a good start. It means people can design games without spending a fortune for the next 5 years.
Unless of course... Companies and Governments begin to rot it, as always.
@Otherlevel51 16 วันที่ผ่านมา ⁺¹
@@absolstoryoffiction6615 They gotta protect their monopolies.
They'll probably make the barriers to entry expensive.
@OscarTheStrategist 16 วันที่ผ่านมา ⁺¹
Can’t believe number theory has led to this. Amazing 🔥
@glamdrag 16 วันที่ผ่านมา ⁺¹
You film yourself disassembling phones for thousands of hours. Then you reverse all the videos, and you train an AI on how to assemble an iPhone with that reversed footage.
That would be a closer analogy since they train the AI on the reverse process of noise to image. They add noise, then they reverse, then they train on the reversed "de-noising" process.
@absolstoryoffiction6615 16 วันที่ผ่านมา
True... Do the same work motion over and over until you get it to a T. Instead of throwing random data which messes up the learning process.
@Ilearnedtodayy วันที่ผ่านมา
The day ai can make open world games is the last day I see daylight.
@centerfield6339 7 วันที่ผ่านมา
They aren't dreaming up the world. They're mostly coding one specific game's look (and even - one player's run through a game) into an AI. It doesn't invent games.
@lancemarchetti8673 12 วันที่ผ่านมา
Building your own Doom maps is tons of fun and a great skill to learn. AI needs to stand back a bit and stop thinking about what humans 'need'.
@Rbyn 15 วันที่ผ่านมา
FASCINATING!!!! I'm not even a gamer and this video fascinates, also love the side story about putting Doom on tiny screens - I had no idea
@cferrarini 6 วันที่ผ่านมา
In the future games will never be obsolete, because graphics and levels will be generated each gameplay...
@Paraselene_Tao 6 วันที่ผ่านมา
I've been telling folks for about 2 years now that AI-genned games are coming. This is an example of it. We're going to see more games like this soon. Perhaps a prompt will guide the story and game mechanics to a degree, but AI-genned games may be fairly unique during each playthrough. The story and mechanics will develop around how the player plays the game, and the AI engine will be a kind of Dungeon Master or Game Master who develops the game as the player interacts with the game.
Interestingly enough, this might lower requirements for hard drive space on folks' computers. It might also provide higher detail audiovisuals for roughly the same processing power because it's all being made or buffered by the AI server. Maybe two bad things about this are that we will need solid internet connections and we will likely get charged monthly subs for the games. Until folks can easily store their own AIs at home, I guess we will be connected to servers where the AI generates the games. Unfortunately, I can see game devs charging monthly subscriptions for games like this. I hope small, powerful AI game engines become a thing soon so we can play the game w/o internet connection or monthly subs.
@copesettic8791 14 วันที่ผ่านมา
Yes, I'm old enough and nerdy enough to remember Commander Keen. Great game.
@thomasb.3833 17 วันที่ผ่านมา ⁺⁶
ASMR voice threatening 😂👍
@Defeye 14 วันที่ผ่านมา
Bro that quick shot of Commander Keen!! I totally remember playing that game. Haven't thought about that in years😂
@ivanernestomujicarios8647 3 วันที่ผ่านมา
6:59: This method won't work for us because we don't have perfect memory. The AI does think about how drawing works for us. You need to remember the shape of the object, where a line goes, how long it is, and so on.
@zalzalahbuttsaab 16 วันที่ผ่านมา ⁺¹
6:50 I'm not an expert in any respect regarding how a neural net works on images but I do know that for each data point that must be learned, there must be a corresponding weight in the model in at least one layer. This doesn't take into account clever methods employed by scientists to reduce computation time. So, using this schema, the output determines the weights required and by the process of backwards propagation, a neural net during learning adjusts its weights so they match the input. In the case of an image with a resolution of 1024 X 786, in terms of the crude schema that I have proposed, the model would require 1024 X 786 weights. Once the weights are established across all the inputs for a given phenomenon, e.g., "cat", then the model can be put to work generating a cat from the noise from grosser to increasingly finer gradations at each point in the available matrix according to the available weights. The choice of gradations would be based on the input words, e.g., "a cat sat on Wesley's laptop computer keyboard while he's filming another shocking YouTube video", such that the various phenomena would be integrated together into the weights as I imagine some sort of vector numerical operation. That's my best guess for what it's worth.
@TheSamucacs 8 วันที่ผ่านมา
The thing is that people don't recognize because they're not familiarized with it. It's like when we thought that ps2 graphics couldn't get better, but now we see flaws in ps5 graphics.
@caiolimacaldas 11 วันที่ผ่านมา
Excessive bureaucracy is one of the major issues leading to the decline of the gaming industry, and AI could potentially address this problem.
@thegringoscottproductions1699 16 วันที่ผ่านมา
I think that dog Cartman thing is actually positive reinforcement. You are adding something versus taking something away to reinforce a desired behavior...
@enlilofnippur8409 3 วันที่ผ่านมา
The problem with generative AI is that it does a better job creating a convincing fake than anything authentic and quality.
@dr.catherineelizabethhalse1820 14 วันที่ผ่านมา
The dramatic event which traumatized me was accidentally depositing my Hitmonlee and losing it forever.
@moo-snuckle 4 วันที่ผ่านมา
And here I was impressed with Diablo dungeons being procedurally generated each playthrough...
@SL3DApps 16 วันที่ผ่านมา ⁺¹
Worthless tech until we have something better than diffusion models that doesn’t hallucinate and can reason about generating content based on rules.
@P8860 16 วันที่ผ่านมา
it's coming no doubt about that
@snekoyl 15 วันที่ผ่านมา
Watch this become a DRM method. Instead of allowing downloads of your actual code and assets, you could train a model on your game and only release access to the AI model. Now no one can hack, mod, backwards engineer, etc.
@Windswept7 15 วันที่ผ่านมา
Since everything is entropy and noise is an analogue for entropy,
all you need to do is reverse entropy by expending energy and that's how I comprehend it with my limited understanding.
With an advanced enough (atomic/molecular) AI it could theoretically transform any matter into any matter with sufficient energy/time.
@Appalachistan_store 6 วันที่ผ่านมา
The diffusion model is basically what Willy Wonka invented to teleport his chocolate bars. 😅
@JoeyNTasha 6 วันที่ผ่านมา
The part @6:00 really has me stumped. Not in the sense that I can't understand it, but that I can't explain it, and I'm usually pretty good at that. The best I can say is that the AI first learns how to convert an image to a distorted image. Then, it learns how to read that distorted image and convert it back to the original image. It is then fed an extremely large database of images so that it can find a pattern and use it to learn to 'create' an image on its own. I hope that makes sense. 😂
@abehemothbeast866 3 วันที่ผ่านมา
Computer games are the modern day puzzle to be read/listened and enjoyed, thank you dude :)
@LastWordSword 16 วันที่ผ่านมา
You had me at "positive reinforcement", but the "ASMR voice" was pure joy!
@jkpesonen 16 วันที่ผ่านมา ⁺¹
ChatGPT rough explanation for image diffusion models: An Intuitive Analogy: Think of it like sculpting a statue from a block of marble. Initially, the marble is just a rough shape (like noise). With each chisel stroke (denoising step), the sculptor refines the shape until a clear statue (image) emerges. The diffusion model is like a sculptor that knows how to chisel away noise to reveal the image underneath.
@papackar 16 วันที่ผ่านมา
Of all the possible ways to remove noise from an image, some are more consistent than others with those arrangements of pixels that the network has learned are associated with a label of “cat”. At each step, the network prefers that pathway of maximum consistency.
Noise removal in the direction of “cat”.
@thetruthexperiment 10 วันที่ผ่านมา
Collecting data on how well humans can tell the difference was a waste of time. You can just see that it’s good. Freakishly good.
@joaoguerreiro9403 16 วันที่ผ่านมา ⁺¹
Game devs will now have to develop AIs that can simulate the environment they intend, while still having to manage the game mechanics and story. Must be really cool to learn all that :)
@makanaima 13 วันที่ผ่านมา
You can write 3D graphics to run on a CPU w/o a graphics card and before OpenGL, they taught courses on this at my university during my CompSci Degree. It’s not easy though. The class was all about using math to do 3D graphics. You had to write your own code to draw basic primitives like lines, circles, etc. and then write the code to draw a 3D scene and then rotate it, move through it, etc. That is all linear algebra math for transforming and projecting through a viewport. Not impossible, but not a walk in the park either. Carmack is definitely a smart dude to pull this off. Particularly b/c getting this to run in a performant way w/o a graphics card would be a feat just in and of itself.
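For the curious, the "transforming and projecting through a viewport" math can be sketched in a few lines of plain Python. This is a textbook-style illustration of the kind of exercise such a course might set, not a claim about how Doom's renderer works; the function names `rotate_y` and `project` and the sample point are made up for the example.

```python
import math

def rotate_y(point, angle):
    """Rotate a 3D point around the vertical (Y) axis by `angle` radians."""
    x, y, z = point
    c, s = math.cos(angle), math.sin(angle)
    return (c * x + s * z, y, -s * x + c * z)

def project(point, focal_length=1.0):
    """Perspective-project a 3D point onto a 2D plane at distance focal_length.

    Assumes the point is in front of the camera (z > 0).
    """
    x, y, z = point
    return (focal_length * x / z, focal_length * y / z)

# A made-up point 4 units in front of the camera.
corner = (2.0, 1.0, 4.0)
screen_xy = project(corner)

# Turn the point 90 degrees around the vertical axis.
turned = rotate_y((1.0, 0.0, 0.0), math.pi / 2)

print(screen_xy)
print(turned)
```

Dividing by `z` is what makes distant things smaller; doing this for every vertex of every frame on a CPU is why performance was the hard part.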
@koincidental 8 วันที่ผ่านมา
Brilliant! One of the best youtube videos I have seen in a while.
Nice voice man, nice presentation.... Noice.
@colourberry 9 วันที่ผ่านมา
The bit about noise is simply a pattern recognition process. Like blurring a camera and then, as you unblur the lens, what the image could be becomes more obvious. A filter basically.
@jimmyjoe1488 16 วันที่ผ่านมา
The response time of the AI Doom to your key presses would be insanely high. If the reviewers had actually played the game, there'd be no confusion.
@nifftbatuff676 13 วันที่ผ่านมา
I will be surprised if the gaming industry won't be dead in the next few years.
@BurningCrusader 2 วันที่ผ่านมา
I think it would be cool to see AI trained to play computer players on some real time strategy games.
@seudonak 16 วันที่ผ่านมา
Regarding how the denoising in the diffusion model works, I think of it similar to an extreme form of compression. You can take a song in .wav format, and with a good compression algorithm, compress it way down to an .mp3, then in real time, as it is playing, the algorithm tries to recreate the original. If instead, you took input from many songs and compressed them, then added some flexibility in the decompression, you have a music generator. The brain of human artists and musicians works in a similar way. A great artist has seen tens of thousands of drawings, paintings, and reference images, and have arranged their neurons to be able to hallucinate various images, so when you ask them to draw a horse, they can hallucinate an image of a horse in their mind and put it on paper. Same with a musician writing musical notes. They hallucinate a song in their mind using decompression of the compressed imprint of all the music they've heard.
@McJiver 13 วันที่ผ่านมา
Amazing that Doom is being used once again in pioneering tech.
@Slaci-vl2io 17 วันที่ผ่านมา ⁺¹
6:59 I think at every iteration it is told to make it more cat-like. It starts with complete noise, for the sake of originality. Humans would start with a blank paper.
@any1alive 17 วันที่ผ่านมา ⁺²
humans work with a white sheet of paper, slowly adding colors; AI works from random noise, slowly adding and removing colors,
but here's the thing, the AI can work from a white page also, adding colors till it gets an image too,
humans will say a cat is this shape so usually do an outline, then work on the next feature, one or 2 at a time kinda
but AI will often think a cat is this shape or idea and shade in the entire cat in rough all at once, kinda outline and fill at the same time, not concentrating on one detail at a time, but a tiny bit of the whole image
like our minds: if we are given 1 second to imagine a cat vs 20 seconds to imagine a cat, it'd get more detailed instantly, just we are slower at putting it from mind to paper
@absolstoryoffiction6615 16 วันที่ผ่านมา ⁺¹
@@any1alive
Also... The AI has no frame of reference. While humans can simply find a cat picture.
Which is why every new iteration of AI has to begin from the ground up from a clean Data Base, as not all iterations can use the same Data Base. While humans can always access a cat picture.
Memory Loss is a big issue with LLMs. They can create new art and solve problems. But they can not replicate those solutions. Always having to start from step 1, then to step 2.
@Slaci-vl2io 16 วันที่ผ่านมา ⁺²
Another key idea here might be Divide et impera. I guess telling the AI to turn complete noise into a cat will fail: too much to focus on all at once. Telling it to do it by iteration, just make it slightly more cat-like for now, can make it happen. That's why sometimes it ends up with five paws: The fifth one was intended to be a shadow or anything in the background, but the next iteration redefines its purpose, this happens repeatedly, and at some point it looks too paw-like to change it back into a shadow.
Why noise: Turning something into something else seems to be more feasible for AI than create from nothing. It's easy to predict what you'll type when there is a previous word. It's easier to predict what changes we should make to a noisy picture to correct it rather than adding things to an empty area. A cloudy sky helps you imagine shapes and animals, and a bit of photoshopping can turn it into actual animals. But no clouds -> no idea.
AI probably watched how humans draw but that wasn't helping it to learn image composition. Turning a more noisy image into a less noisy one is a good approach.
@any1alive 16 วันที่ผ่านมา
@@absolstoryoffiction6615 yep pretty much, it's easier to feed a human than an AI. if we see a weird looking cat it'll seem weird for a few seconds, then yep that's a cat, but if the AI sees a weird cat, it's not just a few seconds to learn and adapt its net, as it's usually pretty fixed
and you can only store so much data in so many points/nodes/neurons or branches, data cramming, that's where data programming and other scientific fields come in. like how many numbers can you count with 4 fingers? well, 4, but you can also do 16 if you use them to count in binary, or 81 if you count in trinary (3 states per finger), or in 4's (3 positions plus down as 0) you could count to 256, all that with just 4 fingers. but it's a matter of trying to remember or store more than that, or you can't kinda without cramming or stuff. i went off on a tangent there and forgot the point i was gonna make lol
but yeah, it's hard to store more information in a certain amount of points, that's why larger models can fit more information kinda, and it's such a big deal when a small model is efficiently trained and works well
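Working through the finger-counting arithmetic: with a fixed number of "slots" (fingers), the number of distinct values you can represent is the number of states per slot raised to the number of slots. A minimal Python sketch, where `distinct_values` is just an illustrative helper name:

```python
def distinct_values(slots, states_per_slot):
    """How many different values fit in `slots` positions with this many states each."""
    return states_per_slot ** slots

fingers = 4

binary = distinct_values(fingers, 2)      # each finger up or down
ternary = distinct_values(fingers, 3)     # three positions per finger
quaternary = distinct_values(fingers, 4)  # four positions per finger

print(binary, ternary, quaternary)
```

Four binary fingers give 16 values (0 through 15), three states per finger give 81, and four states give 256. The storage analogy is the same: the number of slots stays fixed and only the encoding gets denser.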
@any1alive 16 วันที่ผ่านมา ⁺¹
@@Slaci-vl2io yeah, that's usually the steps or iterations in some image generators: how many times to do the loop.
Instead of 1 hard big step, cut it up into many small ones.
@jacob.developer 15 วันที่ผ่านมา
@WesRoth Generative image diffusion with noise basically works in the same way that humans look at clouds and imagine they see familiar shapes. In real life, clouds are really just a bunch of random noise, but humans look for patterns, so when it starts to resemble something similar to what a human has seen before, the human starts to think they see the clouds shaped like that object. Diffusion models work in the same way, where computers are taught to try to see shapes in random noise, and basically try to imagine that they see something in the noise.
@zerazara 3 วันที่ผ่านมา
Video games that are dreaming and hallucinating does however sound strange...
