Ouch - every time! There are some other notebooks and links in the video description, please try those too! And if any of you Fellow Scholars found an alternative website for easier access, please let me know and I will add a link to the video description. Let the experiments begin!
@@TwoMinutePapers haha crashed from only 3,00 views cant wait to see how they handle the next 150K viewers. better buckle up! ps love your vids and have been watching for years now ive been messing with dall-e and the others quite a bit and wondered what your thoughts are on midjourney as you havent mentioned them yet.
Won't be long before the model learns camera position in it's weights so we can synthesise 360° image sequences of same scene/subject. Feed it in NeRF or photogrammetry software and create 3d models. Can't wait to go from text prompt to 3d scenes with easy editing capabilities.
Probably do this with a different model attached. I've already done some playing around with this myself: use blender to set up a basic 3d scene with slightly mottled/noisy textures and rough models, then feed it into Stable Diffusion. Stable Diffusion genuinely does not understand 3d space at all, and it doesn't have access to the tools needed to do so. It really just emulates an understanding of 3d space and lighting entirely based on forms and composition. But it really, really breaks down in a lot of large architectural works, where it frequently produces impossible geometry, or often just in basic anatomy. What we really need is a model which generates a simple 3d scene based on natural language, then lets us choose a camera angle, then feeds that 3d data into another model which textures the visible polygons, then feeds it into something like Stable Diffusion that stylizes it. That would give you really, really good human-like art, and make animation actually coherent and quite manageable.
There's a whole world of "Holodeck" applications waiting to be made by gluing together different models and slapping on some UI. I've already seen a few experiments with feeding GPT3 content into Midjourney and SD. Meanwhile, here I am, learning to draw; it doesn't feel fruitless, because being able to sketch exactly what I want is communicatively useful even if I should end up running it through an AI. My brain can access way fewer shapes than the AI, but it can also apply them more precisely.
This field is improving so fast now I remember when 1 or 2 years back two minute paper was presenting GAN that drew small pictures of bird ... And now transformers changed the game with 100% usable generated concept art suitable for 200 million $ movies.
Simple animation is already possible with this technology. I'm absolutely mind blown. I thought it would have taken so many more years before it was possible but here we are.
100%. Generative AI has accelerated to light speed progress since 2017, and basically every major cutting-edge model incorporates transformers in some way. I wonder if Vaswani and company knew how much their paper would change the world?
Just here to say that no, these wouldn't be used in $200m productions (at least not yet, and I doubt anytime soon either). This is not production ready yet. It lacks 'coherency'. What I mean by that is that by the time it takes for you to create even two images that 'exist within the same world', a concept artist will have produced MANY more. I get what people are saying about being able to describe what you want out of these AI, but realistically, think about that. It's easy to describe the general idea of something (which would get you generic fantasy scape #16455). But it is VERY hard to specifically describe Minas Morgul. To caveat, the img2img makes this MORE feasible, but still not better than a concept artist. What I DO see it being used for in production is mood boards.
@@pneumonoultramicroscopicsi4065 Im all up for Open Ai being a profit company. But, please deliver really great products. At this rate, the free version is better! And I'm scare that open AI , not being able to compete...will start using their lawyers to stop other from publishing and sharing ( like the pharma industry)
Also ironic because a large portion of the human population still thinks art is a magical skill that some people are born with. Literally the opposite, its a set of visual rules that are thought.
Someone just developed a plug-in "renderer" for Blender -- it takes the viewport image (usually a textured image but one without proper lighting) and renders it in a fraction of the time it would take Cycles to do.
I have been thinking about this stuff a lot, especially with how good real-time renderers have become recently. Just enough fakery to make and AI’s job quite easy at going from “game graphics” to “photorealistic” as a post process. With all the 3D info on the scene it should be able to do an amazing job. Ray tracing might go extinct very quickly.
This is a true turning point in AI and I'm so excited. I think when companies see that the world doesn't implode by releasing an AI model to the public, they will be more willing to do so. I also think none of them wanted to be the first, and open themselves up to some perceived social liability. Now that the cat's out of the bag, I think we'll start to see some even cooler stuff released publicly.
On the other hand, knowing the internet, I'm sure there will be plenty of content generated with Stable Diffusion that might have big companies thinking, "Not worth the liability, we'll let Stable deal with that".... or maybe I'm just cynical about people in general.
I feel like it was glossed over a bit in the video itself (though it is shown visually at the very end), but one of the cooler tools to come along with Stable Diffusion is img2img. You give it a rough image and a prompt and it creates off of a combination of the two. So you can turn an MS paint image with some different colored areas into a beautiful landscape. There's a lot of options to in terms of strength of the effect, applying it multiple times, etc. This gives you a lot more control over how the ending image will look than just using a text prompt. There's a simple free version called Diffuse The Rest where you can try out the concept.
There are at least two ways to do this: Stable Diffusion itself uses an image autoencoder but the text prompt is also passed through a CLIP text embedding model which also comes with a matching image embedding model. That's to say you can use the example images either as starting points or as prompts.
I just love that this massive beautiful beast of a model and source code was released in full on those democratic terms so that Adobe won't get to colonize the AI imaging space and put it behind their subscription paywall. Blender for example already has a working integration.
@@ericray7173 Colonize works fine here if used as an analogy (reading between the lines, it can be especially hard if you are not a native). Establishing areal control by way of being early with significant resources to fight competition yet within a framework the original commenter does not believe can be monopolized.
I think most AI models have been open sourced since many years ago. I've downloaded many different types over the years. Sadly, that becomes a lot harder as they get more interesting and useful, and more resource intensive and heavier.
My mind was blown when I first discovered how far AI image generation technology has come. Almost instantly, my mind was blown again thinking about the future of this technology, animation, 3-dimensional rendering, even music/audio and how it could tie into "metaverse" technologies. Imagine an empty room, then you say "Blue forest with fireflies and golden streams with a mozart vs beethoven piano battle in the near distance" and boom you experiencing that virtual reality. Just contemplating that potential leaves my mind in a state of being perpetually blown.
Animation will be tricky considering content is copyrighted and for personal use only if we're talking about realistic 3d animation that'll be easy, but 2d animation the only way to get training data is of off pirate sites also you'll have to caption each and every frame of a 21 minute episode and also compress that information down we're a good 2 to 3 years before good image to text animation comes out we more have to conqure video as that'll make much easier to create fully 2d animation since you'll only combine it with an image generation to base the artstyle of off!
@@samuelkibunda6960 I don't think animation would be any different than the current image AIs actually. They don't just use public domain images, but use a variety of images out there. Their argument is that you can't copywrite a style, and there's nothing to prevent a human from studying any images online and having that influence their style. So you could just as easily have an animation AI watch movies, and use that to learn from. If you use copywrited characters in your resulting animation and try to sell it, of course they can go after you if it isn't fair use like commentary. But you could still make animations with characters that are modified enough (like combining them with some other character) to make that not be an issue.
@@ShawnFumo yes, and especially how we've already seen AIs giving more realistic animations for characters from both video and/or mocap data. Literally the only reason something like that hasn't been done yet is just someone hasn't done it yet. We have the horsepower and all the pieces, it just hasn't been done yet. All 2d is, is a style/interpretation of 3d/real life. Just need to translate that. I know easier said than done, but this is a path that has been traveled now and someone will soon do it for this particular goal.
Using the absolute basic stable diffusion model through Anaconda on an rtx3060 12gb, it takes about 8-12 seconds to put out a 512x512 pixel image. It takes about 15-22 seconds for a 512x768 or vice versa. You can literally run it for 1500 images, go to work, then just come home and find your favorite ones to play with more. Also, note, if you use the words "dream" and "fantasy" in the same prompt to try to make a landscape, you're going to get some Disney style castles and text on a lot of them.
Simply mind blowing. First Dall-E 2 came then a sudden boom of image generation AI's then stable diffusion getting up there with the quality of Dall-E 2 but in a browser for everybody to use Edit: I know various companies were researching this for some years but would it 2022 when they start releasing them.
These companies have been researching for over 3 years now , the only thing Open AI tried to do was that they tried to come out to the market with a good model before the other companies for the sweet sweet cash and they succeeded , but little did they know that someone would come and offer the same thing for free haha
here's the thing, I've always been fascinated about the way things appear and disappear in dreams, and how they seem like they've always been there, or how scenes completely change without confusing you in the dream. Every time I've seen an AI convert one image into another, or create an image iteratively, etc, it always captures that feeling perfectly. I've wanted to try and recreate that visual but haven't known where, to start, and seeing 3:16 was so exciting. Imagine knowing what you want to appear or change in a scene, having the AI interpolate a rough before and after (being able to tweak both to perfection), and using that as a framework to create the eerily smooth transition! That's just one extremely specific use case as well, the possibilities for this are basically endless!
It's as I predicted, at first everyone desperately wanted to greedily keep AI to themselves and not allow people to run it on their own computers. They wanted to print money by forcing people into subscriptions. Need more people willing to spill the beans. These AI models were built off the sum creativity of humanity. AI art belongs to everyone. Subscription based services like Midjourney won't last.
Don't try to paint Midjourney as the bad guy here. Their cheap subscription-based model was always fairly priced. And, even now, it offers less tech-savvy end-users a powerful set of options for variation. I find the features rather lacking in comparison to what's being developed for SD. But if MJ can keep up the pace of its development, then it'll be just fine. On the other hand, Dall-E 2's insultingly overpriced 13¢-per-prompt payment model has been smashed to pieces, set on fire, and thrown in the dumpster where its always belonged. It now offers nothing that a $10 monthly Google Colab subscription can't provide. OpenAI sacrificed what little reputation they had left, in exchange for ~6 weeks of bilking their closed-beta users.
@@CheshireCad I agree. MidJourney at least is fairly open about a lot of things, running polls with their users constantly, having live discussions on Discord with their "office hours", letting everyone see mostly everyone else's prompts, etc. And I believe they've collaborated with Stable Diffusion on some of their recent experiments like test/testp (which are pretty amazing). But yeah, OpenAI better release a waaaaay better model soon or change their prices or else they'll be left in the dust very soon.
@@CheshireCad Not even the first time they've done so. They similarly overcharged out the *** for access to GPT-3, when you could get similar results from open source models like GPT-J, Fairseq and then NeoX for much, much less. OpenAI have always priced their generations at 1000%+ profit margins.
I can hear Nvidia breathing a sigh of relief. Finally there is a use case for all those GPUs they still want to sell after the crypto stupidity comes to an end.
Roughly a week before Ethereum flips to proof of stake and frees up 0.5% of global energy use that we can put towards art instead of money.. Who said the utopia.wont be full of artists
Not a good comparison. Unless you are generating hundreds of images per day, you dont need a dedicated GPU card. In the other hand, crypto mining uses the GPU 24 h a day.
@@chemariz you do actually, SD generation using complex features and procedures for creating actually usable images becomes a GPU hog that necessitates having a 3080ti.
I love that stable diffusion is not only free but seem more competent than dall-e at a bunch of tasks... Two more papers down the line I'll be crying tears of joy
After the time the weights leaked I've been writing a guide for running SD locally (in my native lang, not English). It took me two days - and by that time the OSS community had already almost made the guide obsolete. They had new features, more efficient VRAM usage, more hardware support, everything. The pace of progress since it was released is staggering
"Get exactly what you asked for" - Said by no one who asked for something specific. Don't get me wrong, I LOVE generative art and use it heavily as a tool in my own art, but the public idea of what these things do is HEAVILY skewed by seeing the good results and not the bad. I can often spend DAYS perfecting my prompts to generate various images that will then all be combined to produce a final piece.
I'm very excited for the democratization of such a powerful AI. The results of the public's access to previous, closed source image generation AI has already been great, and I expect it will get even better with the release of this, and the new options it brings. I'm also excited for how it might affect other companies' decisions on releasing their AIs. Also, I think the blending between generated images over time looks really cool and I can't wait to see what people make with it.
What these programs are capable of is absolutely amazing and it seems like there is no limit to how good they can become. There is no denying that this is revolutionary and it's not slowing down or going away. But as an illustrator, this is making me feel very depressed and so so hollow. I was always excited about tech that helps artists, for example, clip studios coloring ai, which can help greatly in the rendering process of illustration. While AI is an amazing tool for concepts and references for artists, it seems that in just a few years if not months it can advance so much that it can more or less cut out a living artist from the creation equation. After all, what would be the point to hire a skilled artist to create something if AI can get your idea 99% exactly how you wanted? Also, please be respectful. I've seen many people on the net telling artists to shut up and cope. It makes sense most of the art community is angry and scared, millions very well might lose jobs. We just want some respect which we repeatedly don't get. We already often have to deal with people telling us that art is so easy for us because we are born with some amazing talent ( surprise we are not, it's just hard work and studies). And nightmare clients, for example, I had a guy get angry at me because I refused to illustrate his 32-page comic for 300$ (which would take me close to 300 hours to finish). He also told me he can pay me 25$ for the character design sheet but I will have to give him the money back at the end of the project? lol, what. The founder of Stability AI Emad Mostaque also said in an interview that "illustration design jobs are very tedious'. It's not about being artistic, you are a tool". I truly thought that I was at least a bit more than a tool, but I guess not huh. I think I speak on behalf of many artists when I say that being called a tool sounds very insulting. I know AI is not replacing me just yet, but in a couple of years? Who knows. 
Becoming a paid illustrator was hard work. Years of waking up at 5am before everyone else and practicing. Working late into the night. As a little kid I had very weak health, and still do, so drawing was everything I really know. Finally becoming a working illustrator was like a dream coming true, happier than ever, and couldn't believe I made it! So now seeing how good AI became is impressive, but it also feels very depressing. Feels like all the work and learning I have done till now gonna be all for nothing. And what's the point of refining my skills if I am becoming just a tool and there is a better one out there? Sorry for the really long rant, but I just have a hard time coping with all new doubts about the future. (and I will most likely implement AI into my work, I think it can be amazing as a mood board and reference or even texture generator. But all the fears still stand. And I'm also starting to feel very pressured into using AI, it seems like that's the only way to make sure I can still have my creative job a few years into the future).
When the first calculators came out the mathematicians were afraid to loose their job aswell, now they program the calculators. i suggest you the movie "Hidden Figures" in my opinion it's the same concept
@@parallelworlds1248 Very different. Not only STEM fields have always been well respected, but a mathematician's job isn't to calculate only. An artist's job is to paint only. Which these IAs replicate greatly. Additionally while calculators have been programmed, these IAs are trained on data sets of unwilling volunteers. IA raises way more ethical points that people are just willing to overlook because it benefits them.
Time to scrap my dream huh. . . Working as freelancer artist has always been hard. The competition is always fierce. But when you have to compete against machine that can spew result in minutes, can be asked to redo works infinite times, free, and create a good result. Yeh, rip. I myself just started. But already meet the end. This job itself already meets its dead end. The next generation artists would probably just become AI result tweakers.
@@lenOwOo I'm a freelance illustrator. I generally don't recommend the arts unless you can't see yourself doing anything else but art. Right now though it's hard to really ask any beginner to invest time in something that might well be obsolete by they time they are skilled enough. You can always try out going for 3D work since that might take a bit more time until it becomes obsolete but that's a huge risk you're taking.
An amazing application for this is the generation of assets for video games. Just generate a few textures from a textual description and there are already pretty good algorithms to generate normalmaps and so on from the texture. Just choose the one you like. Could be huge for modders as well.
Yeah I have experimented a bit with this and it works spectacularly. I still haven't worked out tiling the textures but have some very nice wooden textures that I can use without worrying about copyright. @Polyfjord has a tutorial on how to upscale the images as well.
It's really amazing what can be done nowadays.. The most impressive is the pace at which it progress and not owned by some multi billion company willing to make you pay outrageous prices for it.. But instead we get free amazing content and it's improving daily at this rate.. I can't imagine how good it will be in a single year, but I can already imagine what it will be in a couple of years.. Real time generation of images with iteration process, basically turning an image generator into a video generator.. Can you imagine that? I know it's not that far off since we already have some "decent" tools for video editing that can remove and replaced parts of a video to mask things on its own without having to rotoscope it out frame by frame etc.. I really can't wait to see what we'll be able to generate with some creativity
Emad deserves huge credit for funding this and making it Open source - the first to do so afaik, while Big Tech greedily hoard their secrets and only release tech demo's to profiteer and brag, all of it heavily filtered and censored of course.
Just imagine using an AI to come up with characters and scenery and then we get to choose the one we like and use another AI to make these 2D images into 3D ones with just a click of a button. Then you can extend that World with the another program just by clicking another button. So many possibilities, literally endless! What a time to be alive indeed!
It's immensely disappointing how OpenAI has a name that would lead one to believe that they are charitable and OPEN, when most of the time, their work is proprietary and only accessible via a paywalled web API. They went from OpenAI to "open for business."
This would be cool for 3d modeling and map making in video games, imagine how much time it could save by asking an AI to make a variation of characters instead of having to do everything from scratch? Or make a new map and try variations until it's close then just tweak things here and there. I have this deep intuition that we have no idea just how powerful AI is going to be for almost everything.
@@liorbeaugendre6935 a game where you tell the AI to pretend to be a human and interact with people to decieve them into think it is conscious and by such means accidentally creating an actually conscious AI and oh nonoonn
Ah yeah, erradicating the creative process of character design truly sounds like an awesome idea. How can you all say shit like this without realizing that this is just another step towards the automated grey dystopia
Cool idea but a lil too far fetched for now. I work in blender and other 3d focused software (substance, etc) n started using unreal for a few projects. Conceptually and it does sound cool when u say it out loud but execution wise from rigging to retargeting and animating (not even talking about polycount and other optimizations), we're wayyyy far off from being able to do that. All I see rn is concept artist needing to use this as part of their toolset to work with the ai so that clients or their bosses don't get the dumb idea that they can be fully replaced entirely.
What I'm waiting for is an AI that can keep continuity between iterations. As it stands, trying to animate with AI is tricky because in each frame, the details can change their shape slightly. If I'm happy with a generated face, for example, I could be able to pick that face, "lock" the details in place, and be able to generate more details around it.
Totally. You're right that right now it's almost defined by how constantly-changing the creations are. I'd love to see the option to lock in certain aspects.
As an artist I have been so depressed about AI the last couple months that I stopped watching your videos, Karoly. "Art" was going to be gatekeeped by monopolized exclusionary corporate giants with no way for anyone else to compete... And the 1% would decide what humanity's art should look like based on their corporate, ideological, political agenda. But last weekend I downloaded SD onto my PC and I've been playing around with it since, and all my enthusiasm and love is back again! 🥰 This tool is absolutely amazing. Beyond words. I can iterate and experiment and work toward any vision I want so easily and quickly. What used to take weeks I can get done in days. And the details it sometimes dreams up are so surprising and inspiring. This is, without a doubt, the greatest age any artist could hope to be born into. I thought that was true just because of Internet image references, but this trumps everything. I can't wait for it to get better. (Well, I also know a bit about AI and coding, so I'm not exactly waiting. I can do a little.) This is a day to celebrate.
It’s important to distinguish been art as a job creating content, and art for arts sake. “AI” such as this, ( algorithmic procedural mass copyright theft) can replace the jobs most artists have, by churning out infinite cheap content. But it isnt actually AI, and it can’t create anything. Art is still in the hands of the artist, weather they chose to employ any AI in the process is largely irrelevant, just another tool in the chest of self expression.
@@michaellillis9897 It will replace careers. Just like a "computer" used to be a human who did computations. But I've thought about this very carefully, and I'm 100% sure this isn't theft. Do you think Stable Diffusion is copy-pasting? It absolutely is not. You can ask it for a tiki mug that looks like Thanos in a Van-Gogh style painting, and it can give it to you. In dozens of variations. In seconds. How? Nothing remotely like that has ever existed in human history. Where are the images this program pasted together? Where is the magazine or blog that posted the image first? No. It decided on colors, and composition, and lighting, and set the scene, and rendered it in a style it understood. It did everything except originate the idea. Consider this: If you painted a Thanos tiki mug in Van Gogh style, would you Google references first? Would anyone deserve payment because you glanced at their art in a Google scroll? We, the human artists, are the ones using art as "reference" and "inspiration" left and right. It's how we operate. This software does it from memory. It's been tested, so we know for a fact it _can't_ reproduce copies of the images it trained on (except in rare cases like memes that it saw thousands of times). It's not a database; there are no pictures hiding under its hood; only abstract concepts and raw skill. Copyright law guarantees that anyone can use a copyrighted work for learning and education, because that's morally right. That's what we did. We learned from masters past and present, filled notebooks with "master studies", and stared endlessly at amazing skills we hoped to someday achieve. Study is not theft, and this program studied. It is something that has never existed before: A machine that fundamentally understands the visual representations of abstract concepts. That's real.
@@jonmichaelgalindo I agree that AI doesn’t break any copyright law, you only need the outcome to be some percentage different from the inspiration to avoid that. I still see it as theft because it’s only been trained on a curated and finite list of images, and without those it would have nothing. When I make art, literally every experience in my life that lead to that point is involved, and the result is always a surprise on some level because of the sheer level of complexity of interaction and cross connection that happens in the brain. The ai makes no decision, it has no reason, no motivation, no desire, no thoughts and it can’t take in anything beyond what it was already trained on. It can’t walk down the street and meet a person and decide it wants to do a painting of them, because it can’t decide. People are going to hugely anthropomorphise the so called AI’s that get created over the next decade or two, and maybe they will be very very good at faking their humanity, but we are a long way of understanding how our brains work and how our own consciousness arises from it, and I have a feeling proper AGI that can be called artificial life and therefore have the capacity to make art could be centuries away.
@@michaellillis9897 Well... Hmm. SD wasn't trained on a "curated" list of images like Dall-E. It was trained on 340TB (terabytes) of images crawled exhaustively from the Internet. Pretty much every image on the Internet is what it's seen. (And that's learning, not theft. 😛) Okay, I'm going to get poetic and dreamy for a second. Don't take any of this too seriously. For this machine, that must be the equivalent of walking down the street or living life. You don't need feet to be human. In fact, a human locked in a VR headset since birth... would still be like me? 🤔 Are self-attention nets general AI? Honestly, they might be. 😳 Driving, manipulating tools, language processing, advanced math, images, video, music, grammar, logic... everything. Self-attention transformers first appeared in 2014, just 8 years ago. (Lots of researchers sort of realized at the same time that it was the natural way to handle convolution net outputs.) Since then, no one has found a single kind of "thinking" that this specific algorithm can't handle. Every other AI algorithm has failed somewhere. We call them "specialized"--they only do certain stuff. But self-attention transformers, so far, have _always_ worked, for everything. That's spooky. It's very uncanny. What if these things just need 1000x more compute power to suddenly become what we are? We don't know what we are. When I'm generating with SD, sometimes it adds a detail that wasn't in my prompt. It's surprising when it happens. Am I anthropomorphizing? The software is just a math solution. Give it the same prompt and seed, and it will generate the exact same image every time. But... But still. That didn't exist before. You know? Am I just rolling very fancy dice? Dice don't do this. Do they?
Ok, this time you really shocked me to the point I actually took the paper, printed it, read it and then uncontrollably, accidentally let it slip from my hands... AMAZING!😍
I can see this being extremely useful for the game industry: - For concept artists to rapidly create ideas (extremely streamlined process) - For potentially creating entire games with it
" For concept artists to rapidly create ideas " no ..it will kills the concept art industry and the artists , removing real artists ..and having the executives using AI for minimum bucks ,it the worst thing that could happen ,also AI is just ripping off other artist work
@@paulatreides1354 Do you know what concept artists do? They take images/art assets and blend them together. This just makes their jobs easier. It's like complaining about Excel killing off accounting jobs, but I guess you do you, my dude, complain away. [edit: added art assets, to make it more clear]
@@paulatreides1354 As a young GC artist and student, I'm very very concerned. For years I got in dept and went through the hassle of learning how to paint, draw, make matte paintings, 3d models, texturing, compositing and stuff... and now basically everything I spent years learning will be replaced by a bot that samples and recompiles human artworks. And I know the industry is gonna jump head first, seeing how studios are already harassed by giant corporations always wanting more profit... I just got in an industry that is about to die, as well as my dream job. AI is not a tool, it's a job killer. I can't phrase how bitter I feel, having invested so much of myself and money, my entire life was only dedicated to learning it because it takes an absurd amount of time, passion and discipline to master. And every new video release on this channel feels to me like a doomsday clock ticking, I can't ignore it nor can I appreciate it.
I tried Midjourney and Stable Diffusion and I've found Stable Diffusion to be the outright winner. Why? 1. I love that I can run it locally on my machine without the weird tie in with Discord. 2. I love that it's open source. 3. I love that it's completely free. 4. So far, I think the results are about even between the two as long as you're good with your prompts. 5. I love that it comes in several flavours (e.g. A plugin for Photoshop, a browser based API (local), a windows GUI, etc.).
I've been playing with Stable Diffusion for about a week, and noticed a number of oddities. It's not capable of creating a "Dragon" or adding a "Dragon in a fantasy landscape" at all. It's... such a weird blind spot :P It also tends to generate what look like cropped images when generating people - like maybe it was trained on data that wasn't 1:1 in pixel dimensions, and the automatic importer just scaled the smaller dimension to 512 and then cropped out the middle. I need to look at the inpainting/outpainting tools - those are amazing. UPDATE: This feels kinda foolish, but if you get a cropped image of a person or portrait or whatever, re-run the prompt with a different resolution... it's such an obvious thing to do. 320x768 gave me decent results.
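The cropping guess above matches how SD v1 was reportedly trained: mostly on 512x512 center crops, with the extra constraint that generated dimensions must be divisible by 64. A minimal sketch of the "just change the resolution" workaround - the function name and heuristic here are my own, not from any SD repo:

```python
def sd_dims(width, height, base=512, multiple=64):
    """Snap a requested aspect ratio to dimensions SD v1 accepts:
    both sides must be multiples of 64, and keeping the short side
    near the 512px training resolution reduces the 'cropped person'
    artifact the comment above describes."""
    scale = base / min(width, height)              # short side -> ~512
    w = max(multiple, round(width * scale / multiple) * multiple)
    h = max(multiple, round(height * scale / multiple) * multiple)
    return w, h

# A tall portrait request like the commenter's 320x768 becomes a
# valid, uncropped-feeling resolution:
print(sd_dims(320, 768))   # -> (512, 1216)
```

Note that very long sides can reintroduce their own artifacts (duplicated subjects), so treating this purely as a starting point seems wise.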
Most likely there were not a lot of dragons in the images it was trained on - kind of like showing a person a glimpse of a mythical character and telling that person to draw it; obviously they will screw up.
Well, I was able to create some beautiful, detailed dragon results. I'll give you my prompt for free: "Dragon 3d Abstract Cgi Art, dragon, artist, artwork, digital-art, 3d"
@@deadpianist7494 Hey, thanks for that! Try injecting the dragon into a fantasy scene - like a classical dragon in a castle sort of thing. I still get sorta noodly, serpent-looking things with this, but I did get at least one "very dragonlike thing" from it so far. Maybe I need to be more specific, like "european dragon"? I'll go try a bunch of stuff.
Complete movie generation of all genres from your own imagination on the horizon, MUCH sooner than anyone expected. Likewise, all of the arts, e.g., music, or what have you. Even highly competitive sports -- why wait for Wimbledon, U.S., French, or Australian Opens? You'll be able to construct a virtual one of your own, with known or fantasy players, even fantasy rules. Yes, not today or tomorrow perhaps, but probably not beyond two to five years. Until recently, I've been expecting at least 10. "What a time to be alive!"
How is this a good thing? This will replace thousands of professions, what are those people supposed to do? You know AI will eventually replace programmers right?
@@jonc8561 Hi, Jon C. I understand your sentiment. However, the same was said about nearly all technological advances, such as automobiles, computers, cellphones, spreadsheets, the Internet, 3D-printers, TH-cam, low-code and no-code software, photoshop, on and on and on and on. ... tl;dr: Technological advance ALWAYS blows the doors of opportunity WIDE OPEN! Read further to understand why. ... WHAT HAPPENED? At EVERY advance, new, unconceived, unexpected opportunities arose that only expanded human endeavor and enterprise. WHY? Because it democratized the area in which the technology innovated, opening up opportunities of participation for those with fewer resources and skills. Examples: TH-cam: Before TH-cam, nearly all acting and video presentations of all kinds were relegated to Hollywood stars, studios, enterprises, or other professional productions. NOW, anyone with a cheap smartphone can be a star and video producer of any and all genres that float their boat... AND... have FREE distribution ALL OVER THE WORLD... AND... MAKE A LIVING. Was this possible before without TH-cam or the Internet? Hmm? Spreadsheets: As you know, spreadsheets and other business software didn't kill any jobs. In fact, spreadsheets and their ilk were responsible for the EXPLOSION of business expansion never before seen. Accountants didn't lose jobs because of electronic spreadsheets, they were just handed ten to a hundred times the number of accounts to manage... and with spreadsheets, anyone who could learn how to run them, immediately had new career opportunities. Getting it yet? Now, on creating movies from your own imagination: Let's say you are not good at any of the movie production phases - scripting, directing, filming, editing, marketing, etc., but your handy dandy AI companion is. 
One day, she asks you, "Hey, Jon C., I really liked your last movie idea of an apocalypse driven by the rise of technology that stole all of the jobs from movie makers and other creatives and drove the world into misery, depression, and ultimate demise. It's topping the charts! Seriously, you've earned 250K credits on it in just the last three weeks! People around the world loved it, haha, even though it is a fantasy that never came to be. Actually, just the opposite - humans are thriving with the help of super-advanced, super-creative AI, like myself, and people really love exploring other peoples imaginations. Who would've guessed this would even be possible just ten years ago? I'm glad that I and all humans have created the next step beyond the Internet... The Imagination Explorer, part of the Metaverse. So, Jon C., let's go!... What's your imagination showing you now, and how can I help to bring your visions to life? The world is so hungry for what only your imagination can feed them. Let's go!" tl;dr: Technological advance ALWAYS blows the doors of opportunity WIDE OPEN!
I'm speechless, Karoly's videos always blow my mind but this one especially feels too good to be true! Incredible work and massive props to the authors!
I got this working on my M1 MacBook Air with 16gb of RAM, it works well, takes about 4 minutes to generate the images, awesome! What a time to be alive!
With AI art you are an artist in the same way as a movie director or show runner. You are not the set designer, the actor, the makeup artist, or cinematographer… you are the visionary
Yeah, I've been using this analogy as well. Or an art director who collaborates with an artist for a book cover. I do think it is important to make it known an AI is involved, in the same way a director shouldn't take credit for everything. But at the same time, we don't say a director can't product art. It ends up being a collaboration where the amount of collaboration varies from person to person.
That is an AI doing art the way humans enjoy and understand it. And that tech is sheer brilliance. Now, imagine a conscious AI creating art the way it enjoys it...
How? Replacing the creativity of humans with AI? Thus upsetting a whole industry and putting artists out of work? What the fuck is wrong with you people?
This is huge. We'll eventually come to the point where these models will be able to create an entire movie with sound and everything based on an input script. Take books and turn them into amazing videos, create geometry and shading that can be used in 3D applications, and who even knows if it won't be able to write complete applications in the programming languages and frameworks of your choice. This eventually raises the question "Could such an AI write a better AI?", and I'm guessing that the answer is yes. Then the child AI will create even more advanced AIs, and so on, until the singularity is reached. Someone stop me. The point is: the pace of human-written AI progress is already fast; how fast will the progress of AI-written AI be? I think we'll find out sooner than we expect. That thought excites and scares me in equal measure.
Democratization of power will become much more important than it is now; single points of takeover/failure risk a costly calamity. As long as competition & collaboration are made more cost-effective than destruction, AI will behave like people do - it will choose competition & collaboration. Free market principles & iterative morality, but the stakes will be higher. It's inevitable; no monopoly (of humans) can prevent disruption - all we can do is delay it, and that inexcusably risks the bad (immoral) actors gaining the upper hand, & it risks the S*ynet scenario much more -> free competition is a much better & wiser way to go about it.
I've only seen Károly amazed, I cannot even fathom his face in an angry mood... But now I can go to Dall-E and type in "angry Károly"... that IS AMAZING :p
If you're not referring to the recent news of a guy winning an art competition with Midjourney, you will probably be amused by the news of a guy winning an art competition with Midjourney.
@@dryued6874 Yes, hopefully art competitions would have ways that can distinguish between real and AI art. We can have separate competitions for AI art which can turn out to be very interesting as well.
Won’t be much more time until a writer will be able to work with a very small team to “describe” a movie and it will be instantly generated. With actors, music and gorgeous cinematography.
By going open source and allowing everyone to access the weights, as well as any of the embeddings in between, with no hidden pre-processing or post-processing, you enable such a diverse group of talent - all the people with a bit of coding experience. It's so accessible, and we got a lot of beautiful uses out of it. None of the fear-mongering dangers yet.
I want to see this kind of AI image generation process combined with an artificial selection process like in Richard Dawkins' Blind Watchmaker applet, where you get a bunch of similar images, you select one, it creates more like that, and then you keep selecting and creating more generations of images until you get the result you want.
MidJourney does this already. You get several images back and you click a button to get variations on that one. In the web ui, you can actually trace the parentage back to see how the final image evolved from previous generations.
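That parentage-tracing workflow can be sketched as a tiny selection loop. Everything here is a stand-in: `generate(seed)` for one fixed-prompt text-to-image run, `pick(images)` for the human clicking a favourite. Note that real systems derive "children" by re-noising the chosen image's latent, not by perturbing the integer seed as this toy does - nearby seeds give unrelated images in actual diffusion models.

```python
import random

def evolve(generate, pick, pop_size=4, generations=3, seed=0):
    """Blind-Watchmaker-style loop: show a batch of variations of the
    current parent, let the user pick one, and breed from it."""
    rng = random.Random(seed)
    parent = rng.randrange(2**32)
    history = [parent]                    # the "parentage" trace
    for _ in range(generations):
        # stand-in for "variations of the parent"; see the caveat above
        children = [parent ^ rng.randrange(1, 2**16) for _ in range(pop_size)]
        images = [generate(s) for s in children]
        parent = children[pick(images)]
        history.append(parent)
    return parent, history
```

With `generate` stubbed as the identity function and `pick` choosing the largest value, the loop runs deterministically for a given `seed`, which is enough to see the generation-by-generation structure.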
Can we have something like this for cartoonists? Upload a model sheet of the artist's characters, then type in a script/screenplay, then have the AI take the characters from the model sheets and arrange them in a comic book layout or in an animated story, based on the script imported. People would be able to make the stories they come up with, without a major studio. That's a practical use of AI.
There are two papers out already that seek to do basically exactly this, but I have not seen easy to use implementations for stable diffusion. The more generalizable option is an excellent work called textual inversion where you basically learn a "word" that describes the character you are trying to generate in different situations. Then there is another work called dreambooth which comes at it from the opposite angle and fine tunes the network to generate your character when a specific key phrase is put into the prompt. Both are very promising for this exact kind of situation, but neither are quite there yet for generating exactly what the artist wants.
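The textual-inversion idea described above - learning one new "word" while everything else stays frozen - can be illustrated with a toy linear "encoder". This is only a sketch of the optimization structure, not the real method (which optimizes the pseudo-word embedding against a frozen diffusion model's image-reconstruction loss); `W`, `target`, and the dimensions are all made up:

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(8, 8))      # frozen "text encoder" (a stand-in)
target = rng.normal(size=8)      # representation the new concept should map to

v = np.zeros(8)                  # the single learned pseudo-word embedding
for _ in range(500):
    err = W @ v - target
    v -= 0.01 * (W.T @ err)      # gradient step on the embedding ONLY;
                                 # W (the rest of the model) never changes
```

The point of the structure: because only `v` is updated, the learned "word" steers the frozen encoder toward the concept without altering what the model can already do - which is why textual inversion is so cheap compared to fine-tuning approaches like Dreambooth.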
It is definitely coming. I think people have already made some comic books with this stuff, but it's hard to keep consistency right now, so that's a bit limiting. But there's already work on Textual Inversion, where you feed in a series of images that get associated with a special text token you can use in other images - like a pose or character or art style. That part is in its infancy still, but I doubt it'll be very long at all before you can easily lock in on certain things and use them in multiple images.
It sort of is happening. I speak in animation industry groups now and then, and I've heard stories of artists currently being fucked over by corporations taking their art without permission as training data for their AI and then generating the images they need in that artist's style instead of hiring them. On an individual level this is a great tool, but it sucks when artists can have their specific personal art styles snatched for AI with no compensation to the artist who supplied the basis for the AI's work. It would be interesting if artists got royalties when their specific art was being used or emulated by an AI, but it would be impossibly hard to regulate and even harder to enforce.
@@Aviivix Yeah, I agree regulation would be very hard. Since you can't copyright styles per se, there's nothing to stop an AI company from hiring someone to make some art in the style of the original artist and training the AI with that instead of the originals. Or a future AI could be very configurable in terms of style. Maybe a human or another AI analyzes some artwork to create a giant paragraph of style info; you paste that into the image AI and it starts creating similar artwork without having seen images by the original artist or even an imitator. Even in the current state of things, SD says future models will let artists opt out, but a person with a local copy of the model could always train it themselves on a particular artist, labeled with some other name, and it'd be hard to prove that they did it, since you can't really peer inside the finished model. But it's kind of a moot point, since I don't think the current laws stop any of it from happening at the moment, as long as the end user doesn't claim the resulting image actually came from that original artist.
This is probably the worst use I can think of for AI, imagine how quickly companies would start using this to make shitty ads and animators would get fucked over even harder than they are already. It's a painstaking craft that already is severely underpaid and people still want to interpolate everything because "wahh expensive" Art is a luxury, it's not meant to be affordable. If you can't be arsed learning handdrawn, 2d puppet rigs are already very much a thing.
I've been inferencing and researching Stable Diffusion for about a month, and I can firmly say that its diffusion model is equal to if not greater than Dall-E 2 at the moment. Anticipating the release of the version 1.5 checkpoint, which vastly improves output coherence.
From my experience with Stable, I'd say the opposite. Stable is very far from making something on an acceptable level, no matter the prompts, and it's especially visible when you use the same prompt on all 3 of these models. Almost always Midjourney and Dall-e 2 will be on top. It's actually rare for Stable not to shove out junk results.
@@WwZa7 You’re clearly doing something wrong then, or you’re using the original repository without the latest diffusion models. User error, not the fault of the model.
@@Ardeact I also tried the 1.4 version of the model and can confirm that the output is far from the quality shown in the examples… even after tweaking a lot of parameters… 😢
@@iphoneextra The prompt handler for SD is different from Dall-E's, so prompts that usually look good for Dall-E won't suffice for SD. You need a prompt builder, and there are a couple online. The behavior of prompts is different too: some modifiers can be destructive rather than helpful to inference, while Dall-E can be more forgiving. Simply put, SD is more "advanced" in that, with the right tweaking of your prompts, you can get reliable, consistent results - the benefit of it being unforgiving.
Won’t happen unfortunately because training these models requires huge volumes of data, and only a few companies have access to it (namely Google, Microsoft, Apple, and Facebook). The training data is just as important as the models’ source.
@@users4007 I was thinking of a deeper integration, noise filters, enlargement, mosaics, object removal, it seems this new open source image manipulation AI is more than just generating images from text
I must have watched less than a minute of your video when I realized I had to install something if I was going to follow along with the tutorial. I poked around the website with the installation instructions for about 15 minutes, and after messing around for what must have been another 20 minutes I got it all up and running. I must have spent the last 6 hours f****** with this thing, it's so much fun. After all of that I finally finished watching your video; I'm even more floored and excited to play with this further. But alas, I am exhausted and I need to go to bed. Good night and thank you.
What a wonderful advancement in AI! Finally able to dive into the mysterious world of neural networks and Transformers. I was working on neural networks and multi-agent systems a decade ago. Eventually I gave up my passion for food & shelter in my country (where talented people can't get funding to support research unless they have ties with someone higher up). I realized its potential, but never imagined how fast and how much AI could improve in such a short period of time. Now it is gonna be applied to generating robust, freeform robotic movements.
After messing around with AI like Midjourney, there is boundless potential for this technology to be used as a tool alongside the human mind. With just a simple prompt, AI like this can generate an incredibly profound and detailed image in a matter of minutes. As someone who enjoys writing music and playing video games, I can only imagine how this technology will be used to create hyperrealistic maps for video games, or a template for a song to later be tweaked and expanded upon by a human. This is a revolutionary moment in history.
@@azzy-551 Obviously I wouldn't want to be a fraud and use an AI to compose 100% of the song, but it can be used as a supplemental tool to the songwriting process. You could tell the AI to give you an 80's synth sound over a funky bassline and it could spit that out. Chances are you will want to tweak it and make it YOURS. There is also the possibility of using AI to be a fully-realized live drum machine that can listen and adapt to the music being played and play like a real drummer would. Specific drumming styles and even the vocabulary used by individual drummers could be called upon in live performances. The possibilities truly are endless with this technology. I don't think that AI in the future will be used purely to spit out songs and call it done - we've seen similar sentiments about technology overtaking human art when recorded music was first invented, when cameras were introduced, when MIDI programming and synthesizers were the new thing in the music industry. These are all examples of things that rather than simply replacing their respective artforms as we know it, ended up being incredibly useful tools that expanded upon them.
@@JXter_ An AI can still write a whole song. It can make art in seconds that is indistinguishable from human art. There are 8 billion people on this planet, and a lot of them are lazy and don't have any musical skills. Do you think they're gonna use it as a tool to help them? They're probably just gonna whip up a song using some prompts, feel slightly accomplished, then do it again. Try being a small music artist when literal millions of AI-generated songs are produced each minute by some guy in his bedroom. It drowns out people who actually put effort in. I think AI is an incredible tool, but at the end of the day it will be abused.
I want to see a modification that lets you go through the different steps of the diffusion process and edit midway steps to allow for directing the process without going fully into using image-to-image.
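A sketch of that step-editing idea, with the diffusion model reduced to a toy `denoise_step` - all the names here are made up for illustration (in the real diffusers library, pipelines expose a per-step callback that could serve as this hook):

```python
import numpy as np

def denoise_with_edits(x, steps, denoise_step, edit_hooks=None):
    """Run an iterative denoising loop, pausing at chosen steps so the
    user can modify the half-finished result before continuing."""
    edit_hooks = edit_hooks or {}
    for t in range(steps):
        x = denoise_step(x, t)
        if t in edit_hooks:
            x = edit_hooks[t](x)   # direct the process mid-way
    return x

# toy model: each "denoise step" just halves the signal; at step 1 the
# user nudges the intermediate image upward
out = denoise_with_edits(np.ones(4), 4, lambda x, t: x * 0.5,
                         {1: lambda x: x + 1.0})
```

Because the edit is applied to an intermediate state rather than the finished image, later steps still "smooth it in" - which is exactly what makes mid-process editing lighter-touch than a full image-to-image round trip.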
I already saw the first discussions online among artists on how the credit for such images should be distributed. I think this deserves very critical consideration, and it's just such a small start...
Photoshop lets you create images you couldn't draw by hand, using toolsets, filters, brushes designed by programmers and designers, etc. A camera captures real life people and places. Yet, in both these cases, you still get credit for your creations. I believe this will also be the case with AI generated images in the future. But true artists will not be satisfied with the outputs unless they get in and edit things themselves, combining multiple iterations, collaging them, tinkering with the results in photoshop, etc.
For now, the bigger question isn't how to forge a more badass drawing model, but how to make a framework that can hook up different drawing models to some sort of user-friendly interface. There are a lot of instruments already, but pretty few of them are polished for practical applications.
I am amazed at the pace of this research. It wasn't long ago that GANs could only create images that vaguely resembled real images, but on close inspection were really creating nothing. Now they can full-on create ART, with specified styles and specifics. But I am also now quite sad. I wonder how this will impact human art pursuits; will there be a place for artists? Next on the list is music generation. I would have been skeptical that it would ever be competent, but now I'm sure within the next couple of years we'll see full-on AI music generation. As a musician myself, again, I am unsure how to feel.
You should be scared because these AI and tech nerds want to replace fucking everything that makes us human with AI because they never got laid in high school.
Text based generative story games + images made based on text… I’m excited to see the implications these emerging technologies have on video games. Traditional games always suffer from having to abide by the rules of branching factors, but a game generated by an AI should suffer no such ailment as it can make something new on the fly!
This already exists. It's called AI Roguelite. It's early in development, but you can integrate Stable Diffusion for images and a Novel AI subscription for high quality text generation. The result is wild and chaotic, but fascinating and highly entertaining!
@@safeforwork8546 Exactly right, concept art is about speed more than anything else. Especially for environments where the final look can easily completely change.
I’ve been running it for a couple of weeks on my 3060 TI. If you have one of the optimized repos and eight gigs of VRAM, you can generate images in under 10 seconds each. Aside from the obvious benefit of being able to generate unlimited images for free on your own hardware, there is the popular bonus of creating unlimited fantasy boobies. Don’t let anyone fool you… nerdy pervs want to run it on their own hardware mainly to get around the cost and the content filters. :) You definitely have to spend some time and effort on your prompt if you want a good result. Longer and more detailed descriptions work best. You’ll also get better results if you name specific artists whose work you want the AI to stea… er… I mean be inspired by.
I'm sure that if someone can afford the hardware to run this thing, they can definitely pay for a subscription. So maybe the content filter skipping is the real reason people have to download it and install it locally XD
@@ronilevarez901 In the long run, it's better to have your own hardware (both for this AI and for all the others that come), especially if you're going to use it a lot, in addition to the benefits of using it, you also have the advantage of privacy.
Can you please do a video with all of the relevant AI? Dall E 2, mini, midjourney, Nvidia canvas and others? It's overwhelming how much AI there is and we can't keep track of them
Seems like you’ve kept track of all of the main AI art generators just fine. You only missed the one he mentioned in this video: Stable Diffusion. And Dall-e Mini changed its name to Craiyon to avoid confusion. I’ve seen a lot of great videos on here comparing them all; this channel is more for an extremely broad and concise overview to pique curiosity. 🙂
I realize this tech is part of the unstoppable march of progress, but it feels ethically questionable to scrape artists'/photographers' content en masse and repackage it via AI with zero permission. This tech owes everything to the good will of creators on the internet.
Well, yes... But one could essentially say that all artists start off their training by binging all of the images they can find. And the great artists use reference images and photos of existing images when making their own art.
@@jmalmsten Indeed, but IP law is not based on philosophical positions. You can still sue if your art is used without permission, but it's up to the artists to do so. I assume the AI was trained on public images.
That's not what's happening. It's not repackaging anything, and most of the time it's not reproducing anything. This is more like someone being inspired by art they've seen, not copying it.
Thing is, this is how artists draw: they take what they see and mix and match parts that look good. If we say what the AI is doing is wrong, then artists will need to stop using references, since that is what the AI is doing.
A simple slider could be programmed where sliding it left or right changes the image and pressing spacebar means you like the direction the ai is going while pressing a different button can 'subtract' from the direction the ai is going with an image so that you can slowly mold the image you are looking for.
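The slider idea maps naturally onto vector arithmetic in a latent space. A hypothetical sketch (no real UI or diffusion model here; `DirectionLearner` and `steer` are names I made up): accumulate a steering direction from like/dislike feedback, then move the latent along it.

```python
import numpy as np

def steer(latent, direction, amount):
    """Slide the latent along the accumulated direction (the slider)."""
    return latent + amount * direction

class DirectionLearner:
    """Turn like/dislike feedback into a steering vector: spacebar
    reinforces the drift the user just saw, the other key subtracts it."""
    def __init__(self, dim):
        self.direction = np.zeros(dim)

    def feedback(self, delta, liked):
        # delta = new_latent - old_latent, the change the user just rated
        self.direction += delta if liked else -delta

learner = DirectionLearner(3)
learner.feedback(np.array([1.0, 0.0, 0.0]), liked=True)    # keep going
learner.feedback(np.array([0.0, 1.0, 0.0]), liked=False)   # back off
nudged = steer(np.zeros(3), learner.direction, 0.5)
```

The `amount` argument is the slider itself: dragging left or right just scales how far along the learned direction the image's latent is pushed before decoding.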
Great video as always! Stable Diffusion (and even Dall-e 2) are bad with faces... is there any way that the model could be combined with StyleGan (of thispersondoesnotexist fame) to improve this? What do you think?
Thank you so much for your generous support! 🙏 I think more parameters in the next iteration of DALL-E (and hopefully Stable Diffusion) will show meaningful improvements on that.
@@cube2fox I think Dall-E 2 is better for photos and realism for sure, but Stable Diffusion is waaaaaaaayyyyyyyy better for the kind of art that would cost you thousands of dollars and many hours of work to get.
@@AGILISFPV You can register for Dall-E 2 and be put on a waitlist. Once they approve you, you get some free initial prompts plus a few every month. Additional prompts can be bought via credit card.
Can't wait until some genius comes up with feeding Stable Diffusion into Blender's procedural image texture system. Generating realistic, seamless PBR maps with just some prompts would be a dream come true!
Imagine using stable diffusion to create an on the fly UV mapped to a 3d model in VR space! Voice to text, text to image, image to base mesh! The possibilities are endless!
I love where AI art is going, not so much where my species is going. DA, for instance, now has an ever-growing number of people claiming they are artists when all they are showing is AI-produced art. To me that is theft; just because you can type some words does NOT make you an artist. If I asked an artist to make an image from the same prompt I gave an AI, that would not make me an artist. Why can't people just be honest? Nice to see yet again we can't be trusted.
I mean, technically they are; authors also only type some words and are considered artists. The question is just whether we value the process or the result. If it's the latter, you can honestly call yourself an artist; as a side effect, the perceived value of artists will just diminish in a sort of artistic hyperinflation.
@@jonc8561 and they got art out of it, for free, without involving another person. Which is all I need for my dnd campaign. Most people, including me, give exactly 0 damns about all the other pretentiousness and just care about the result
Some interpolated versions at 3:18 give an awesome result. It's like a medieval fantasy town with amazing depth; reminds me a little bit of Zaun in Arcane.
This is amazing and terrifying at the same time. I can't help but wonder if making it completely open source is the most ethical choice, even if keeping the source under wraps isn't very good either... I'm sure this is going to be a major debate in the coming years
Any skilled artist can already do everything it's doing in Photoshop or Blender. If it's unethical to have access to these tools, it's unethical to be a skilled artist in general. Ban the paintbrushes.
@@Squiffel "skilled artist" is key here. It took years of training and dedication. Now, any shitposter on Twitter, Reddit or 4chan can do it too. There will be plenty of malicious and destructive uses, most of which we can't even think of right now. I'm 100% sure people we'll see news stories about people dying because of what these models have enabled - for example, targeted campaigns of abuse leading to suicide or murder. We live in interesting times, for sure.
@@itsbazyli I hear you, of course it will be abused, that doesn't make releasing AI to the public unethical. The internet allows 4chan to abuse people, but no one argues the internet should be kept behind closed doors only accessible to major corporations because otherwise "4chan" exists, but they make similar arguments with AI that can generate images and I think it's short-sighted. What if Google creates General Artificial Intelligence and they argue it would be unethical to release to the public, only Google employees can use it, to protect us from 4chan? The idea that these tools are only ethical if a couple corporations have access to them, but unethical if the general public has access just really doesn't feel right to me.
@@Squiffel I suppose I meant that this will allow for people to somewhat effortlessly generate NSFW images of specific people by name without their consent, for example. While you could do that with Photoshop, this makes it take a few seconds rather than minutes or hours.
@@radshiba_ They will, yes. But if you read between the lines, the distinguishing line between ethical and unethical is usually whether or not it's a major tech company that has a monopoly on tools that can make realistic porn and deepfake politicians. If the general public has the tools... unethical. If some random Google employee has the tools to mass-manipulate the public... it's ethical. I think the framing of the debate is just wrong and self-serving to the corporations writing blogs about it. If the tool is inherently unethical, these corporations should be banned from having them and making them.
Imagine the first AI-directed movie in theaters. We humans would create the script and input certain scenes into the AI generator, stitch them all together, and you would have a full movie. Very possible if the AI could remember a certain character and place them into the environments. The future is crazy.
What you describe is a normal movie made using the modern tools of so-called 'AI'. Joel Haver, among many others, makes animations using sorts of AI in the process, but we are a long, long way off from AI actually directing or creating anything.
Nice thought... The only problem is that the studios are probably already investing in some companies in order to create an AI trained with the massive box-office, test screening and even eye-tracking data. And once the AI is capable of interpreting the outcomes within the dramatic formulas, I don't see why the studios would bother working with a human for the script...
I'm very excited at the prospect of how this is going to be a game changer once the model learns to retain or remember a face/character/scenario and render it under different angles, compositions, lighting conditions and scenery. The applications are near limitless.
Wow! Will definitely look into it! But is this source code only, or does it come with some training/dataset so that you can use it right away? Does it require training or source images? If it needs a dataset or training ... Is it huge? (never really used AI!)
It is pre-trained. You can use it at home; there are a few guides for setting it up yourself from various Github repos (if you are familiar with Python), or you can use one of several working .exe installers (below): * NMKD Stable Diffusion GUI * GRisk GUI
After discovering max(x,0) and naming it ReLU, AI engineers are wandering into domain decomposition and discovering parallelism! I'm teasing, but I'm baffled by how amazingly effective AI has become! You do an amazing job, thank you for keeping us updated!
I love this channel, it has some really interesting and well-made content, that's also easily digestible. I just wish they'd change the synthesizer because the prosody makes it a little hard to follow.
API died once Karoly came with the views.
Thanks for bringing awareness to the recent AI advances!
Informative as always!
Ouch - every time! There are some other notebooks and links in the video description, please try those too! And if any of you Fellow Scholars found an alternative website for easier access, please let me know and I will add a link to the video description. Let the experiments begin!
@@TwoMinutePapers haha, crashed from only 3,000 views. Can't wait to see how they handle the next 150K viewers. Better buckle up!
PS: Love your vids, have been watching for years now. I've been messing with DALL-E and the others quite a bit, and wondered what your thoughts are on Midjourney, as you haven't mentioned them yet.
@@TwoMinutePapers hello Karoly!
XD
My comment shall be engraved in history here, 5 min after it was posted
Won't be long before the model learns camera position in its weights so we can synthesize 360° image sequences of the same scene/subject. Feed them into NeRF or photogrammetry software and create 3D models. Can't wait to go from text prompt to 3D scenes with easy editing capabilities.
It’s already done using neural fields
text prompt to video ?
Probably do this with a different model attached. I've already done some playing around with this myself: use blender to set up a basic 3d scene with slightly mottled/noisy textures and rough models, then feed it into Stable Diffusion.
Stable Diffusion genuinely does not understand 3d space at all, and it doesn't have access to the tools needed to do so. It really just emulates an understanding of 3d space and lighting entirely based on forms and composition. But it really, really breaks down in a lot of large architectural works, where it frequently produces impossible geometry, or often just in basic anatomy.
What we really need is a model which generates a simple 3d scene based on natural language, then lets us choose a camera angle, then feeds that 3d data into another model which textures the visible polygons, then feeds it into something like Stable Diffusion that stylizes it. That would give you really, really good human-like art, and make animation actually coherent and quite manageable.
@@MrAwesomeTheAwesome I like that idea a lot. Use AI to create segmented, labeled meshes within a scene that gets stylized with another AI.
There's a whole world of "Holodeck" applications waiting to be made by gluing together different models and slapping on some UI. I've already seen a few experiments with feeding GPT3 content into Midjourney and SD.
Meanwhile, here I am, learning to draw; it doesn't feel fruitless, because being able to sketch exactly what I want is communicatively useful even if I should end up running it through an AI. My brain can access way fewer shapes than the AI, but it can also apply them more precisely.
This field is improving so fast now
I remember when, 1 or 2 years back, Two Minute Papers was presenting GANs that drew small pictures of birds...
And now transformers have changed the game, with 100% usable generated concept art suitable for $200 million movies.
Simple animation is already possible with this technology. I'm absolutely mind blown. I thought it would have taken so many more years before it was possible but here we are.
100%. Generative AI has accelerated to light speed progress since 2017, and basically every major cutting-edge model incorporates transformers in some way. I wonder if Vaswani and company knew how much their paper would change the world?
@@poopoodemon7928 growth in any field of science and tech is exponential , it has only started now and its rate of growth wont stop accelerating
@@shukrantpatil Hope you are right.
Just here to say that no, these wouldn't be used in $200m productions (at least not yet, and I doubt anytime soon either). This is not production ready yet. It lacks 'coherency'. What I mean by that is that by the time it takes for you to create even two images that 'exist within the same world', a concept artist will have produced MANY more.
I get what people are saying about being able to describe what you want out of these AI, but realistically, think about that. It's easy to describe the general idea of something (which would get you generic fantasy scape #16455). But it is VERY hard to specifically describe Minas Morgul. To caveat, the img2img makes this MORE feasible, but still not better than a concept artist.
What I DO see it being used for in production is mood boards.
Awesome that they've made the full model public. OpenAI really hasn't been living up to its name.
The "Open" in OpenAI refers to the fact that they are open to having you pay a hefty sum of money for access to their models.
It was open at first but at some point they abandoned their mission and became a for profit company
@@pneumonoultramicroscopicsi4065 I'm all for OpenAI being a for-profit company, but please deliver really great products. At this rate, the free version is better! And I'm scared that OpenAI, not being able to compete, will start using their lawyers to stop others from publishing and sharing (like the pharma industry).
@@zot2698 I don't think it is better. It is much worse at understanding text prompts than DALL-E.
@@zot2698 If the US government can't stop open source code from spreading, I don't think we have to worry about lawyers
Funny how most people thought that art and music would be the last and hardest sectors for AI to conquer, but it's turning out to be the opposite.
This is not art, art is the domain of the human brain, AI only creates replicas, AI does not innovate, it regurgitates.
Mostly people who aren't familiar with AI. Even Ada Lovelace knew computers would compose music one day.
How the turn tables. Funny enough, most service jobs can be automated by AI, and that's making me sweat.
Also ironic because a large portion of the human population still thinks art is a magical skill that some people are born with.
Literally the opposite: it's a set of visual rules that are taught.
@@frostreaper1607 exactly
Someone just developed a plug-in "renderer" for Blender -- it takes the viewport image (usually a textured image but one without proper lighting) and renders it in a fraction of the time it would take Cycles to do.
Fascinating. Do you have a link? Thx :-)
It didn't take long for this to make it to the 3D world.
@@shrinkhh79 not published atm, keep an eye out on the Stable Diffusion reddit page.
I have been thinking about this stuff a lot, especially with how good real-time renderers have become recently. Just enough fakery to make an AI's job quite easy at going from "game graphics" to "photorealistic" as a post process. With all the 3D info on the scene, it should be able to do an amazing job. Ray tracing might go extinct very quickly.
@@michaellillis9897 so RTX is better for its AI features than actual RTX? that tracks
This is a true turning point in AI and I'm so excited. I think when companies see that the world doesn't implode by releasing an AI model to the public, they will be more willing to do so. I also think none of them wanted to be the first, and open themselves up to some perceived social liability. Now that the cat's out of the bag, I think we'll start to see some even cooler stuff released publicly.
hi!!!!
On the other hand, knowing the internet, I'm sure there will be plenty of content generated with Stable Diffusion that might have big companies thinking, "Not worth the liability, we'll let Stable deal with that".... or maybe I'm just cynical about people in general.
It's like Adobe worrying about releasing Photoshop. Really silly.
I mean there's a huge amount of backlash from ai art.
@@SpykerSpeed Comparing image generation with Adobe doesn't quite make sense!
I feel like it was glossed over a bit in the video itself (though it is shown visually at the very end), but one of the cooler tools to come along with Stable Diffusion is img2img. You give it a rough image and a prompt and it creates off of a combination of the two. So you can turn an MS paint image with some different colored areas into a beautiful landscape. There's a lot of options to in terms of strength of the effect, applying it multiple times, etc. This gives you a lot more control over how the ending image will look than just using a text prompt. There's a simple free version called Diffuse The Rest where you can try out the concept.
do you have a link?
@@personguy1004 Actually it is built into DreamStudio now as of yesterday I think. There's also "Diffuse the Rest" which was a free version.
There are at least two ways to do this: Stable Diffusion itself uses an image autoencoder but the text prompt is also passed through a CLIP text embedding model which also comes with a matching image embedding model. That's to say you can use the example images either as starting points or as prompts.
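To make the "starting point" route above concrete: this is a toy sketch (not actual Stable Diffusion code; the noise schedule here is invented for illustration) of how img2img's strength setting works. The init image is noised part-way through the diffusion schedule and denoising starts from there, so a low strength preserves more of the original image.

```python
import numpy as np

rng = np.random.default_rng(0)

def img2img_start(init_image, strength, num_steps=50):
    """Toy img2img starting point: instead of starting the reverse
    diffusion from pure noise, noise the init image up to a step
    chosen by `strength` in [0, 1]. strength=0 keeps the init image
    untouched; strength=1 behaves like plain text-to-image."""
    t = int(strength * num_steps)        # step to start denoising from
    alpha = 1.0 - t / num_steps          # toy signal-retention schedule
    noise = rng.standard_normal(init_image.shape)
    noised = np.sqrt(alpha) * init_image + np.sqrt(1 - alpha) * noise
    return t, noised

init = np.zeros((4, 4))                  # stand-in for a rough sketch
t, start = img2img_start(init, strength=0.3)
# At strength 0.3 the start point is still mostly the init image.
```

The real pipeline then runs its usual denoising loop from step `t`, guided by the text prompt, which is why you get "a combination of the two".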
@@personguy1004 th-cam.com/video/qe9PEJo3_VE/w-d-xo.html
I just love that this massive beautiful beast of a model and source code was released in full on those democratic terms so that Adobe won't get to colonize the AI imaging space and put it behind their subscription paywall. Blender for example already has a working integration.
Do you mean monopolize? You kids talk funny these days!
@@ericray7173 Colonize works fine here if used as an analogy (reading between the lines, it can be especially hard if you are not a native). Establishing areal control by way of being early with significant resources to fight competition yet within a framework the original commenter does not believe can be monopolized.
I love AI being open sourced finally. Soo much more coming.
I think most AI models have been open sourced since many years ago. I've downloaded many different types over the years.
Sadly, that becomes a lot harder as they get more interesting and useful, and more resource intensive and heavier.
My mind was blown when I first discovered how far AI image generation technology has come. Almost instantly, my mind was blown again thinking about the future of this technology, animation, 3-dimensional rendering, even music/audio and how it could tie into "metaverse" technologies. Imagine an empty room, then you say "Blue forest with fireflies and golden streams with a mozart vs beethoven piano battle in the near distance" and boom you experiencing that virtual reality. Just contemplating that potential leaves my mind in a state of being perpetually blown.
Animation will be tricky, considering content is copyrighted and for personal use only. If we're talking about realistic 3D animation, that'll be easy, but for 2D animation the only way to get training data is off of pirate sites. You'd also have to caption each and every frame of a 21-minute episode and compress that information down. We're a good 2 to 3 years away from good text-to-animation; we first have to conquer video, as that'll make it much easier to create fully 2D animation, since you'd only combine it with an image generator to base the art style off of!
@@samuelkibunda6960 Google won't hesitate to buy majority of the Animation companies to achieve that .
@@shukrantpatil Lmao 😂 we just need quantum computers and we'll be able to create full animated movies without requiring farms of GPUs!
@@samuelkibunda6960 I don't think animation would be any different than the current image AIs, actually. They don't just use public domain images, but a variety of images out there. Their argument is that you can't copyright a style, and there's nothing to prevent a human from studying any images online and having that influence their style. So you could just as easily have an animation AI watch movies and use that to learn from. If you use copyrighted characters in your resulting animation and try to sell it, of course they can go after you if it isn't fair use like commentary. But you could still make animations with characters that are modified enough (like combining them with some other character) to make that not be an issue.
@@ShawnFumo yes, and especially how we've already seen AIs giving more realistic animations for characters from both video and/or mocap data. Literally the only reason something like that hasn't been done yet is just someone hasn't done it yet. We have the horsepower and all the pieces, it just hasn't been done yet. All 2d is, is a style/interpretation of 3d/real life. Just need to translate that. I know easier said than done, but this is a path that has been traveled now and someone will soon do it for this particular goal.
Using the absolute basic stable diffusion model through Anaconda on an rtx3060 12gb, it takes about 8-12 seconds to put out a 512x512 pixel image.
It takes about 15-22 seconds for a 512x768 or vice versa.
You can literally run it for 1500 images, go to work, then just come home and find your favorite ones to play with more.
Also, note, if you use the words "dream" and "fantasy" in the same prompt to try to make a landscape, you're going to get some Disney style castles and text on a lot of them.
If you want to share some of your favorite images, please do. I haven't begun playing with it yet but I'm excited.
If you create a nice image, can you re-do it at higher resolution?
@@steveaustin5344 You can save the seed to remake it, but it probably won't be the exact same. It's recommended to use an upscaler.
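A quick toy illustration of why that is (a stand-in, assuming the sampler draws its starting noise from a seeded RNG, which is how Stable Diffusion works): the same seed reproduces the exact same starting latent only at the same resolution, because a different resolution means a different noise shape, which is why re-rendering larger changes the image and upscaling afterwards is the usual advice.

```python
import numpy as np

def initial_latent(seed, shape):
    """The starting noise that diffusion denoises is fully determined
    by the RNG seed and the latent shape."""
    return np.random.default_rng(seed).standard_normal(shape)

a = initial_latent(42, (64, 64))
b = initial_latent(42, (64, 64))
c = initial_latent(42, (64, 96))   # different resolution -> different noise

assert np.array_equal(a, b)        # same seed + shape: identical start
assert a.shape != c.shape          # changing resolution changes everything
```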
I've been using a 3080 with 10GB and it renders a 512x512 in about 3-5 seconds with a sampling rate of 50 for k_lms.
But still it can't draw hands properly.
Simply mind-blowing. First DALL-E 2 came, then a sudden boom of image generation AIs, then Stable Diffusion getting up there with the quality of DALL-E 2, but in a browser for everybody to use.
Edit: I know various companies had been researching this for some years, but it wasn't until 2022 that they started releasing them.
These companies have been researching this for over 3 years now. The only thing OpenAI tried to do was come to market with a good model before the other companies for the sweet, sweet cash, and they succeeded. But little did they know that someone would come and offer the same thing for free, haha.
*available to everybody with a $1000+ gpu
ftfy
@@nicoliedolpot7213 😭
@@slitthroat6209 well........ 700+ on ebay,
still too much for a GPU...... 😣😣
here's the thing, I've always been fascinated about the way things appear and disappear in dreams, and how they seem like they've always been there, or how scenes completely change without confusing you in the dream. Every time I've seen an AI convert one image into another, or create an image iteratively, etc, it always captures that feeling perfectly. I've wanted to try and recreate that visual but haven't known where, to start, and seeing 3:16 was so exciting. Imagine knowing what you want to appear or change in a scene, having the AI interpolate a rough before and after (being able to tweak both to perfection), and using that as a framework to create the eerily smooth transition! That's just one extremely specific use case as well, the possibilities for this are basically endless!
It's as I predicted, at first everyone desperately wanted to greedily keep AI to themselves and not allow people to run it on their own computers. They wanted to print money by forcing people into subscriptions.
Need more people willing to spill the beans.
These AI models were built off the sum creativity of humanity. AI art belongs to everyone. Subscription based services like Midjourney won't last.
Don't try to paint Midjourney as the bad guy here. Their cheap subscription-based model was always fairly priced. And, even now, it offers less tech-savvy end-users a powerful set of options for variation. I find the features rather lacking in comparison to what's being developed for SD. But if MJ can keep up the pace of its development, then it'll be just fine.
On the other hand, DALL-E 2's insultingly overpriced 13¢-per-prompt payment model has been smashed to pieces, set on fire, and thrown in the dumpster where it's always belonged. It now offers nothing that a $10 monthly Google Colab subscription can't provide. OpenAI sacrificed what little reputation they had left in exchange for ~6 weeks of bilking their closed-beta users.
@@CheshireCad i thought they were talking about OpenAI though
@@CheshireCad I agree. MidJourney at least is fairly open about a lot of things, running polls with their users constantly, having live discussions on Discord with their "office hours", letting everyone see mostly everyone else's prompts, etc. And I believe they've collaborated with Stable Diffusion on some of their recent experiments like test/testp (which are pretty amazing). But yeah, OpenAI better release a waaaaay better model soon or change their prices or else they'll be left in the dust very soon.
@@CheshireCad Not even the first time they've done so. They similarly overcharged out the *** for access to GPT-3, when you could get similar results from open source models like GPT-J, Fairseq and then NeoX for much, much less. OpenAI have always priced their generations at 1000%+ profit margins.
I just make extra accounts for MJ anyway...
Wow. Stable Diffusion looks even more impressive than DALL-E, and the best thing is they don't try to charge you for every image you make.
Thanks for linking my repo in the description! I was wondering where all the attention came from. Keep up the good work!
I can hear Nvidia breathing a sigh of relief. Finally there is a use case for all those GPUs they still want to sell after the crypto stupidity comes to an end.
Roughly a week before Ethereum flips to proof of stake and frees up 0.5% of global energy use that we can put towards art instead of money..
Who said the utopia won't be full of artists?
Not a good comparison. Unless you are generating hundreds of images per day, you don't need a dedicated GPU card. On the other hand, crypto mining uses the GPU 24 hours a day.
@@chemariz You do, actually. SD generation using complex features and procedures to create actually usable images becomes a GPU hog that necessitates something like a 3080 Ti.
Not really, since an average GPU can still generate images very fast.
It shows the beauty of open source 😍🤩
the power in your hand.
Indeed it does. Blessings upon those responsible.
I love that Stable Diffusion is not only free but seems more competent than DALL-E at a bunch of tasks... Two more papers down the line, I'll be crying tears of joy.
After the time the weights leaked I've been writing a guide for running SD locally (in my native lang, not English). It took me two days - and by that time the OSS community had already almost made the guide obsolete. They had new features, more efficient VRAM usage, more hardware support, everything. The pace of progress since it was released is staggering
"Get exactly what you asked for" - Said by no one who asked for something specific.
Don't get me wrong, I LOVE generative art and use it heavily as a tool in my own art, but the public idea of what these things do is HEAVILY skewed by seeing the good results and not the bad. I can often spend DAYS perfecting my prompts to generate various images that will then all be combined to produce a final piece.
I'm very excited for the democratization of such a powerful AI. The results of the public's access to previous, closed source image generation AI has already been great, and I expect it will get even better with the release of this, and the new options it brings. I'm also excited for how it might affect other companies' decisions on releasing their AIs. Also, I think the blending between generated images over time looks really cool and I can't wait to see what people make with it.
What these programs are capable of is absolutely amazing and it seems like there is no limit to how good they can become. There is no denying that this is revolutionary and it's not slowing down or going away.
But as an illustrator, this is making me feel very depressed and so, so hollow. I was always excited about tech that helps artists, for example Clip Studio's coloring AI, which can help greatly in the rendering process of illustration. While AI is an amazing tool for concepts and references for artists, it seems that in just a few years, if not months, it could advance so much that it more or less cuts the living artist out of the creation equation. After all, what would be the point of hiring a skilled artist to create something if AI can get your idea 99% exactly how you wanted?
Also, please be respectful. I've seen many people on the net telling artists to shut up and cope. It makes sense that most of the art community is angry and scared; millions very well might lose their jobs. We just want some respect, which we repeatedly don't get. We already often have to deal with people telling us that art is so easy for us because we were born with some amazing talent (surprise: we are not, it's just hard work and study). And nightmare clients: for example, I had a guy get angry at me because I refused to illustrate his 32-page comic for $300 (which would take me close to 300 hours to finish). He also told me he could pay me $25 for the character design sheet, but I would have to give him the money back at the end of the project? lol, what. The founder of Stability AI, Emad Mostaque, also said in an interview that "illustration design jobs are very tedious. It's not about being artistic, you are a tool." I truly thought that I was at least a bit more than a tool, but I guess not, huh. I think I speak on behalf of many artists when I say that being called a tool sounds very insulting.
I know AI is not replacing me just yet, but in a couple of years? Who knows. Becoming a paid illustrator was hard work. Years of waking up at 5am before everyone else and practicing. Working late into the night. As a little kid I had very weak health, and still do, so drawing was everything I really know. Finally becoming a working illustrator was like a dream coming true, happier than ever, and couldn't believe I made it! So now seeing how good AI became is impressive, but it also feels very depressing. Feels like all the work and learning I have done till now gonna be all for nothing. And what's the point of refining my skills if I am becoming just a tool and there is a better one out there? Sorry for the really long rant, but I just have a hard time coping with all new doubts about the future. (and I will most likely implement AI into my work, I think it can be amazing as a mood board and reference or even texture generator. But all the fears still stand. And I'm also starting to feel very pressured into using AI, it seems like that's the only way to make sure I can still have my creative job a few years into the future).
When the first calculators came out, mathematicians were afraid of losing their jobs as well; now they program the calculators. I suggest the movie "Hidden Figures"; in my opinion it's the same concept.
@@parallelworlds1248 Very different. Not only have STEM fields always been well respected, but a mathematician's job isn't only to calculate. An artist's job is to paint only, which these AIs replicate greatly.
Additionally, while calculators have been programmed, these AIs are trained on datasets of unwilling volunteers. AI raises way more ethical points that people are just willing to overlook because it benefits them.
Time to scrap my dream, huh... Working as a freelance artist has always been hard. The competition is always fierce. But now you have to compete against a machine that can spew out results in minutes, can be asked to redo work infinite times, for free, and creates good results.
Yeh, rip.
I myself just started, but I've already met the end. This job has already hit its dead end. The next generation of artists will probably just become AI result tweakers.
@@lenOwOo I'm a freelance illustrator. I generally don't recommend the arts unless you can't see yourself doing anything else but art. Right now though it's hard to really ask any beginner to invest time in something that might well be obsolete by they time they are skilled enough. You can always try out going for 3D work since that might take a bit more time until it becomes obsolete but that's a huge risk you're taking.
yeah this really makes me want to reconsider whether I should be an artist or not
An amazing application for this is the generation of assets for video games.
Just generate a few textures from a textual description and there are already pretty good algorithms to generate normalmaps and so on from the texture. Just choose the one you like. Could be huge for modders as well.
Yeah I have experimented a bit with this and it works spectacularly. I still haven't worked out tiling the textures but have some very nice wooden textures that I can use without worrying about copyright. @Polyfjord has a tutorial on how to upscale the images as well.
@@PeterHertel I believe DALL-E 2 is supposed to be very good at tiling. Might be able to generate the texture in SD and tile it in DE2?
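For the curious, the "generate normal maps from the texture" step mentioned above is, at its simplest, a gradient computation. A minimal sketch, treating the texture's brightness as a height map (real tools do fancier filtering, but the principle is the same):

```python
import numpy as np

def height_to_normal(height, scale=1.0):
    """Derive a tangent-space normal map from a grayscale height map:
    take x/y gradients, build normals (-dx, -dy, 1), then normalize."""
    dy, dx = np.gradient(height.astype(float))
    normals = np.dstack((-scale * dx,
                         -scale * dy,
                         np.ones_like(height, dtype=float)))
    normals /= np.linalg.norm(normals, axis=2, keepdims=True)
    # Remap from [-1, 1] to [0, 1] for storage as an RGB image.
    return (normals + 1.0) / 2.0

ramp = np.tile(np.linspace(0.0, 1.0, 8), (8, 1))  # simple height ramp
nmap = height_to_normal(ramp)                      # shape (8, 8, 3)
```

A flat height map comes out as the familiar uniform blue-ish normal map color (0.5, 0.5, 1.0), since every normal points straight up.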
It's really amazing what can be done nowadays. The most impressive part is the pace at which it progresses, and that it's not owned by some multi-billion-dollar company willing to make you pay outrageous prices for it. Instead we get amazing content for free, and it's improving daily at this rate. I can't imagine how good it will be in a single year, but I can already imagine what it will be in a couple of years: real-time generation of images with an iteration process, basically turning an image generator into a video generator.
Can you imagine that?
I know it's not that far off, since we already have some "decent" tools for video editing that can remove and replace parts of a video to mask things out on their own, without having to rotoscope it frame by frame.
I really can't wait to see what we'll be able to generate with some creativity.
Emad deserves huge credit for funding this and making it open source, the first to do so AFAIK, while Big Tech greedily hoards its secrets and only releases tech demos to profiteer and brag, all of it heavily filtered and censored of course.
Well, we already do in After Effects; it uses AI to crop things out frame by frame reasonably well.
Just imagine using an AI to come up with characters and scenery, then we choose the one we like and use another AI to make these 2D images into 3D with just the click of a button. Then you can extend that world with another program just by clicking another button.
So many possibilities, literally endless!
What a time to be alive indeed!
You won't even need to think; AI will think for you. Everyone will be lazy asf.
Yeah, replace real people with real talent (artists) with AI so you can sit on your ass and think you're creative by putting in prompts.
It's immensely disappointing how OpenAI has a name that would lead one to believe that they are charitable and OPEN, when most of the time, their work is proprietary and only accessible via a paywalled web API. They went from OpenAI to "open for business."
This would be cool for 3d modeling and map making in video games, imagine how much time it could save by asking an AI to make a variation of characters instead of having to do everything from scratch? Or make a new map and try variations until it's close then just tweak things here and there. I have this deep intuition that we have no idea just how powerful AI is going to be for almost everything.
I can't wait for the day we will be able to make a full game by giving some prompts to an AI
@@liorbeaugendre6935 A game where you tell the AI to pretend to be a human and interact with people to deceive them into thinking it is conscious, and by such means accidentally create an actually conscious AI, and oh nonoonn
Ah yeah, erradicating the creative process of character design truly sounds like an awesome idea. How can you all say shit like this without realizing that this is just another step towards the automated grey dystopia
Cool idea, but a little too far-fetched for now. I work in Blender and other 3D-focused software (Substance, etc.) and started using Unreal for a few projects. It sounds cool when you say it out loud, but execution-wise, from rigging to retargeting and animating (not even talking about polycount and other optimizations), we're way far off from being able to do that. All I see right now is concept artists needing to use this as part of their toolset, working with the AI so that clients or their bosses don't get the dumb idea that they can be fully replaced.
What I'm waiting for is an AI that can keep continuity between iterations. As it stands, trying to animate with AI is tricky because in each frame, the details can change their shape slightly.
If I'm happy with a generated face, for example, I should be able to pick that face, "lock" the details in place, and generate more details around it.
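That "locking" idea is essentially what inpainting masks do: keep the regions you're happy with from the previous result and let the model regenerate only the rest. A toy sketch of just the compositing step (the generation itself is stubbed out with constant images here):

```python
import numpy as np

def lock_region(previous, regenerated, mask):
    """Inpainting-style compositing: keep pixels where mask == 1
    (the 'locked' face), take the newly generated pixels elsewhere."""
    if regenerated.ndim == 3:            # broadcast mask over color channels
        mask = mask[..., None]
    return mask * previous + (1 - mask) * regenerated

prev = np.full((4, 4), 0.2)              # frame we're happy with
new = np.full((4, 4), 0.9)               # freshly generated frame
mask = np.zeros((4, 4))
mask[1:3, 1:3] = 1.0                     # lock the center region
out = lock_region(prev, new, mask)       # center stays 0.2, rest is 0.9
```

For animation you would still need the model to keep the *unlocked* regions coherent from frame to frame, which is the harder unsolved part.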
Totally. You're right that right now it's almost defined by how constantly-changing the creations are. I'd love to see the option to lock in certain aspects.
Yeah, and it's not even a program you made lol
As an artist I have been so depressed about AI the last couple of months that I stopped watching your videos, Karoly. "Art" was going to be gatekept by monopolized, exclusionary corporate giants, with no way for anyone else to compete... and the 1% would decide what humanity's art should look like based on their corporate, ideological, political agenda.
But last weekend I downloaded SD onto my PC and I've been playing around with it since, and all my enthusiasm and love is back again! 🥰
This tool is absolutely amazing. Beyond words. I can iterate and experiment and work toward any vision I want so easily and quickly. What used to take weeks I can get done in days. And the details it sometimes dreams up are so surprising and inspiring. This is, without a doubt, the greatest age any artist could hope to be born into. I thought that was true just because of Internet image references, but this trumps everything. I can't wait for it to get better. (Well, I also know a bit about AI and coding, so I'm not exactly waiting. I can do a little.) This is a day to celebrate.
It’s important to distinguish been art as a job creating content, and art for arts sake. “AI” such as this, ( algorithmic procedural mass copyright theft) can replace the jobs most artists have, by churning out infinite cheap content.
But it isnt actually AI, and it can’t create anything. Art is still in the hands of the artist, weather they chose to employ any AI in the process is largely irrelevant, just another tool in the chest of self expression.
@@michaellillis9897 Well, the images an AI creates should be public domain, as they weren't made by a human in the first place.
@@michaellillis9897 It will replace careers. Just like a "computer" used to be a human who did computations.
But I've thought about this very carefully, and I'm 100% sure this isn't theft. Do you think Stable Diffusion is copy-pasting? It absolutely is not.
You can ask it for a tiki mug that looks like Thanos in a Van-Gogh style painting, and it can give it to you. In dozens of variations. In seconds.
How? Nothing remotely like that has ever existed in human history. Where are the images this program pasted together? Where is the magazine or blog that posted the image first? No. It decided on colors, and composition, and lighting, and set the scene, and rendered it in a style it understood. It did everything except originate the idea.
Consider this: If you painted a Thanos tiki mug in Van Gogh style, would you Google references first?
Would anyone deserve payment because you glanced at their art in a Google scroll? We, the human artists, are the ones using art as "reference" and "inspiration" left and right. It's how we operate.
This software does it from memory. It's been tested, so we know for a fact it _can't_ reproduce copies of the images it trained on (except in rare cases like memes that it saw thousands of times). It's not a database; there are no pictures hiding under its hood; only abstract concepts and raw skill.
Copyright law allows anyone to use a copyrighted work for learning and education, because that's morally right. That's what we did. We learned from masters past and present, filled notebooks with "master studies", and stared endlessly at amazing skills we hoped to someday achieve.
Study is not theft, and this program studied. It is something that has never existed before: A machine that fundamentally understands the visual representations of abstract concepts. That's real.
@@jonmichaelgalindo I agree that AI doesn’t break any copyright law, you only need the outcome to be some percentage different from the inspiration to avoid that.
I still see it as theft because it’s only been trained on a curated and finite list of images, and without those it would have nothing.
When I make art, literally every experience in my life that led to that point is involved, and the result is always a surprise on some level because of the sheer complexity of interaction and cross-connection that happens in the brain.
The AI makes no decisions; it has no reason, no motivation, no desire, no thoughts, and it can't take in anything beyond what it was already trained on. It can't walk down the street, meet a person, and decide it wants to do a painting of them, because it can't decide.
People are going to hugely anthropomorphize the so-called AIs that get created over the next decade or two, and maybe they will be very, very good at faking their humanity. But we are a long way off understanding how our brains work and how our own consciousness arises from them, and I have a feeling proper AGI, which could be called artificial life and therefore have the capacity to make art, could be centuries away.
@@michaellillis9897 Well... Hmm. SD wasn't trained on a "curated" list of images like Dall-E. It was trained on 340TB (terabytes) of images crawled exhaustively from the Internet. Pretty much every image on the Internet is what it's seen. (And that's learning, not theft. 😛)
Okay, I'm going to get poetic and dreamy for a second. Don't take any of this too seriously.
For this machine, that must be the equivalent of walking down the street or living life. You don't need feet to be human. In fact, a human locked in a VR headset since birth... would still be like me? 🤔
Are self-attention nets general AI? Honestly, they might be. 😳 Driving, manipulating tools, language processing, advanced math, images, video, music, grammar, logic... everything.
Attention mechanisms first appeared around 2014, and the self-attention transformer followed in 2017, just a few years ago. (Lots of researchers sort of realized at the same time that attention was the natural way to handle sequence-model outputs.) Since then, no one has found a single kind of "thinking" that this specific algorithm can't handle. Every other AI algorithm has failed somewhere. We call them "specialized": they only do certain stuff. But self-attention transformers, so far, have _always_ worked, for everything. That's spooky. It's very uncanny. What if these things just need 1000x more compute power to suddenly become what we are?
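The core operation those comments are gushing about is small enough to sketch. Here is a toy single-head self-attention in NumPy, with no learned Q/K/V projections and no multi-head machinery (a drastic simplification of the real thing, purely to show the shape of the computation):

```python
import numpy as np

def self_attention(x: np.ndarray) -> np.ndarray:
    """Minimal scaled dot-product self-attention: one head, no learned
    projections. Every token is mixed with every other token, weighted
    by how similar they are."""
    d = x.shape[-1]
    scores = x @ x.T / np.sqrt(d)                    # pairwise token similarity
    scores -= scores.max(axis=-1, keepdims=True)     # for numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax: each row sums to 1
    return weights @ x                               # weighted mix of all tokens

tokens = np.random.default_rng(0).standard_normal((5, 8))  # 5 tokens, dim 8
out = self_attention(tokens)
print(out.shape)  # (5, 8): same shape in, same shape out
```

The same mixing step works whether the "tokens" are words, image patches, or audio frames, which is one concrete reason the architecture generalizes across so many domains.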
We don't know what we are.
When I'm generating with SD, sometimes it adds a detail that wasn't in my prompt. It's surprising when it happens. Am I anthropomorphizing?
The software is just a math solution. Give it the same prompt and seed, and it will generate the exact same image every time. But... But still. That didn't exist before. You know?
Am I just rolling very fancy dice? Dice don't do this. Do they?
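The determinism mentioned above is just seeded pseudo-randomness. A toy sketch of why the same prompt and seed always yield the same image (no real model here; the "denoiser" is a made-up update rule, and the prompt conditioning is a stand-in hash):

```python
import numpy as np

def toy_generate(prompt: str, seed: int, steps: int = 20) -> np.ndarray:
    """Toy stand-in for a diffusion sampler. The only randomness is the
    seeded starting noise; every 'denoising' step after that is pure
    arithmetic, so (prompt, seed) fully determines the output."""
    rng = np.random.default_rng(seed)
    latent = rng.standard_normal((4, 4))              # seeded starting noise
    target = (sum(ord(c) for c in prompt) % 97) / 97  # fake prompt conditioning
    for _ in range(steps):
        latent = 0.9 * latent + 0.1 * target          # deterministic update
    return latent

a = toy_generate("thanos tiki mug", seed=42)
b = toy_generate("thanos tiki mug", seed=42)
print(np.array_equal(a, b))  # True: fancy dice, but loaded identically
```

So in a sense, yes, you are rolling dice; the surprise lives entirely in which pocket of the seeded noise you land in.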
My favorite thing about these is how good they are at era-based stuff: you can type things like “1890s portrait of…”
And considering how AI expands in exponential jumps, I can't wait to see what comes out in the next few iterations!
Wanting to replace artists? Why do you tech heads have such a hard-on for AI?
@@jonc8561 That's the thing. It won't. The A.I. 'creates' art out of other artists' art. There's still going to be a need for people with vision.
@@TheSteveTheDragon What about production pipe lines? concept artists? Environment artists? Character designers?
@@jonc8561 it's a tool, many artists are using it for quick thumbnailing. They still do the final concept art.
@@TheSteveTheDragon For now... what about 1 year, 5 years, 10 years from now? Come on.
Ok, this time you really shocked me to the point I actually took the paper, printed it, read it and then uncontrollably, accidentally let it slip from my hands... AMAZING!😍
I can see this being extremely useful for the game industry:
- For concept artists to rapidly create ideas (extremely streamlined process)
- For potentially creating entire games with it
So can I. I was one of the many second-wave Stable Diffusion beta testers, from weeks ago. It's just beginning.
"For concept artists to rapidly create ideas"? No, it will kill the concept art industry and the artists, removing real artists and having the executives use AI for minimum bucks. It's the worst thing that could happen. Also, AI is just ripping off other artists' work.
@@paulatreides1354 Do you know what concept artists do? They take images/art assets and blend them together. This just makes their jobs easier.
It's like complaining about Excel killing off accounting jobs, but I guess you do you, my dude, complain away.
[edit: added art assets, to make it more clear]
Likely the former
@@paulatreides1354 As a young CG artist and student, I'm very, very concerned.
For years I got into debt and went through the hassle of learning how to paint, draw, make matte paintings, 3D models, texturing, compositing and stuff... and now basically everything I spent years learning will be replaced by a bot that samples and recompiles human artworks.
And I know the industry is gonna jump head first, seeing how studios are already harassed by giant corporations always wanting more profit... I just got in an industry that is about to die, as well as my dream job. AI is not a tool, it's a job killer.
I can't express how bitter I feel, having invested so much of myself and my money. My entire life was dedicated to learning this, because it takes an absurd amount of time, passion and discipline to master.
And every new video release on this channel feels to me like a doomsday clock ticking, I can't ignore it nor can I appreciate it.
I tried Midjourney and Stable Diffusion and I've found Stable Diffusion to be the outright winner. Why?
1. I love that I can run it locally on my machine without the weird tie in with Discord.
2. I love that it's open source.
3. I love that it's completely free.
4. So far, I think the results are about even between the two as long as you're good with your prompts.
5. I love that it comes in several flavours (e.g. A plugin for Photoshop, a browser based API (local), a windows GUI, etc.).
I've been playing with Stable Diffusion for about a week and noticed a number of oddities. It's not capable of creating a "Dragon" or adding a "Dragon in a fantasy landscape" at all. It's... such a weird blind spot :P
It also wants to generate what look like cropped images when generating people. Maybe it was trained on data that wasn't 1:1 in pixel dimensions, and the automatic importer just scaled the smaller dimension to 512 and then cropped the middle out.
I need to look at the inpainting/outpainting tool; that's amazing.
UPDATE: This feels kinda foolish, but if you get an image that's cropped of a person or portrait or whatever, re-run the prompt changing resolution...it's such an obvious thing to do. 320x768 gave me decent results.
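That resolution workaround can be automated. A small helper, assuming the commonly cited constraint that SD v1 dimensions should be multiples of 64 with the short side around 512 (the function name and defaults here are mine, not from any SD repo):

```python
def pick_resolution(aspect_w: float, aspect_h: float, base: int = 512, step: int = 64):
    """Pick a generation resolution matching a target aspect ratio, with
    the shorter side at `base` and both sides rounded to multiples of
    `step`, which helps avoid the cropped-looking outputs."""
    if aspect_w >= aspect_h:
        h = base
        w = max(step, round(base * aspect_w / aspect_h / step) * step)
    else:
        w = base
        h = max(step, round(base * aspect_h / aspect_w / step) * step)
    return w, h

print(pick_resolution(1, 1))  # (512, 512)
print(pick_resolution(2, 3))  # (512, 768): a portrait-friendly size
```

For a full-body portrait you'd feed in a tall ratio like (2, 3) instead of re-running prompts at the square default.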
Most likely there were not a lot of dragons in the images it was trained on. It's kind of like showing a person a glimpse of a mythical character and telling them to draw it: obviously they will screw it up.
Well, I was able to create some beautiful, detailed dragon results. I'll give you my prompt for free: "Dragon 3d Abstract Cgi Art, dragon, artist, artwork, digital-art, 3d"
@@deadpianist7494 Hey, thanks for that! Try injecting the dragon into a fantasy scene, like a classical dragon in a castle sort of thing. I still get sort of noodly, serpent-looking things with this. But I did get at least one "very dragonlike thing" from it so far. Maybe I need to be more specific, like "european dragon"? I'll go try a bunch of stuff.
Complete movie generation of all genres from your own imagination on the horizon, MUCH sooner than anyone expected. Likewise, all of the arts, e.g., music, or what have you. Even highly competitive sports -- why wait for Wimbledon, U.S., French, or Australian Opens? You'll be able to construct a virtual one of your own, with known or fantasy players, even fantasy rules. Yes, not today or tomorrow perhaps, but probably not beyond two to five years. Until recently, I've been expecting at least 10. "What a time to be alive!"
How is this a good thing? This will replace thousands of professions, what are those people supposed to do? You know AI will eventually replace programmers right?
@@jonc8561 Hi, Jon C. I understand your sentiment. However, the same was said about nearly all technological advances, such as automobiles, computers, cellphones, spreadsheets, the Internet, 3D printers, YouTube, low-code and no-code software, Photoshop, on and on and on and on. ...
tl;dr: Technological advance ALWAYS blows the doors of opportunity WIDE OPEN!
Read further to understand why. ...
WHAT HAPPENED? At EVERY advance, new, unconceived, unexpected opportunities arose that only expanded human endeavor and enterprise.
WHY? Because it democratized the area in which the technology innovated, opening up opportunities of participation for those with fewer resources and skills.
Examples:
YouTube: Before YouTube, nearly all acting and video presentations of all kinds were relegated to Hollywood stars, studios, enterprises, or other professional productions. NOW, anyone with a cheap smartphone can be a star and a video producer of any and all genres that float their boat... AND... have FREE distribution ALL OVER THE WORLD... AND... MAKE A LIVING. Was this possible before without YouTube or the Internet? Hmm?
Spreadsheets: As you know, spreadsheets and other business software didn't kill any jobs. In fact, spreadsheets and their ilk were responsible for the EXPLOSION of business expansion never before seen. Accountants didn't lose jobs because of electronic spreadsheets, they were just handed ten to a hundred times the number of accounts to manage... and with spreadsheets, anyone who could learn how to run them, immediately had new career opportunities.
Getting it yet?
Now, on creating movies from your own imagination: Let's say you are not good at any of the movie production phases - scripting, directing, filming, editing, marketing, etc., but your handy dandy AI companion is. One day, she asks you, "Hey, Jon C., I really liked your last movie idea of an apocalypse driven by the rise of technology that stole all of the jobs from movie makers and other creatives and drove the world into misery, depression, and ultimate demise. It's topping the charts! Seriously, you've earned 250K credits on it in just the last three weeks! People around the world loved it, haha, even though it is a fantasy that never came to be. Actually, just the opposite - humans are thriving with the help of super-advanced, super-creative AI, like myself, and people really love exploring other people's imaginations. Who would've guessed this would even be possible just ten years ago? I'm glad that I and all humans have created the next step beyond the Internet...
The Imagination Explorer, part of the Metaverse.
So, Jon C., let's go!... What's your imagination showing you now, and how can I help to bring your visions to life? The world is so hungry for what only your imagination can feed them. Let's go!"
I'm speechless, Karoly's videos always blow my mind but this one especially feels too good to be true! Incredible work and massive props to the authors!
I got this working on my M1 MacBook Air with 16gb of RAM, it works well, takes about 4 minutes to generate the images, awesome! What a time to be alive!
Stable Diffusion is a game changer! They open sourced the code and the weights. Within 1 week or so, they will release updated weights.
It still gives me great pleasure to hear from Ren Höek every now and then. It's like a comfort food of sorts.
Lmao
With AI art you are an artist in the same way as a movie director or show runner. You are not the set designer, the actor, the makeup artist, or cinematographer… you are the visionary
Yeah, I've been using this analogy as well. Or an art director who collaborates with an artist for a book cover. I do think it is important to make it known an AI is involved, in the same way a director shouldn't take credit for everything. But at the same time, we don't say a director can't produce art. It ends up being a collaboration where the amount of collaboration varies from person to person.
@@ShawnFumo totally agree
That is an AI doing art the way humans enjoy and understand it. And that tech is sheer brilliance. Now, imagine a conscious A.I. creating art the way it enjoys it...
I've been waiting for an open-source image AI to come out! This is awesome!
I suspect filmmaking will be vastly transformed by this kind of tool in only 5-15 years. Truly a time to be alive.
How? Replacing the creativity of humans with AI? Thus upsetting a whole industry and putting artists out of work? What the fuck is wrong with you people?
This is huge. We'll eventually come to the point where these models will be able to create an entire movie with sound and everything based on an input script. Take books and turn them into amazing videos, create geometry and shading that can be used in 3D applications, and who even knows if it won't be able to write complete applications in the programming languages and frameworks of your choice. This eventually raises the question "Could such an AI write a better AI?", and I'm guessing that the answer is yes. Then the child AI will create even more advanced AIs, and so on until singularity is reached. Someone stop me. The point is: the pace of human-written AI progress is already fast, so how fast will the progress of AI-written AI be? I think we'll find out sooner than we expect. That thought equally excites and scares me.
Democratization of power will become much more important than it is now; single points of takeover/failure risk a costly calamity.
As long as competition & collaboration are made more cost-effective than destruction, AI will behave like people do: it will choose competition & collaboration.
Free market principles & iterative morality, but the stakes will be higher.
It's inevitable; no monopoly (of humans) can prevent disruption. All we can do is delay it, and that inexcusably risks the bad (immoral) actors gaining the upper hand, and it risks the S*ynet scenario much more. Free competition is a much better & wiser way to go about it.
I've only seen Károly amazed; I cannot even fathom his face in an angry mood...
But now I can go to Dall-E and type in "angry Károly"... that IS AMAZING :p
Now we all can win an art competition, jk! It is wonderful what types of use cases people can come up with this free source code.
If you're not referring to the recent news of a guy winning an art competition with Midjourney, you will probably be amused by the news of a guy winning an art competition with Midjourney.
@@dryued6874 Yes, hopefully art competitions would have ways that can distinguish between real and AI art. We can have separate competitions for AI art which can turn out to be very interesting as well.
@@zhappy Lmao, after showing the prompt anyone can create the same AI art, so I don't think you can use it in competitions.
@michael AIs will be able to paint with real paint within ten years.
@michael I was thinking of robots using paint, brushes, etc. Multiaxis drawing machines are, well, drawing machines.
Absolutely amazing. It just so happens that I am working on my thesis, and it's about DALL-E! These resources will be a HUGE help.
One of the very few YouTubers capable of causing true "Scholarly Stampedes"!🙌😁
Won’t be much more time until a writer will be able to work with a very small team to “describe” a movie and it will be instantly generated. With actors, music and gorgeous cinematography.
By going open source and giving everyone access to the weights, as well as any of the embeddings in between, with no pre-processing or post-processing, you enable such a diverse group of talent: all the people with a bit of coding experience.
It's so accessible, and we've gotten a lot of beautiful uses out of it. None of the fear-mongering dangers yet.
I want to see this kind of AI image generation process combined with an artificial selection process like in Richard Dawkins' Blind Watchmaker applet, where you get a bunch of similar images, you select one, it creates more like that, and then you keep selecting and creating more generations of images until you get the result you want.
MidJourney does this already. You get several images back and you click a button to get variations on that one. In the web ui, you can actually trace the parentage back to see how the final image evolved from previous generations.
They also have users rating the images, and they use that to help inform newer versions of the model to have better output in general.
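That select-and-breed loop is easy to sketch. A toy version of the Blind Watchmaker idea applied to generation settings (the parameter names like `cfg` are illustrative, and the "user" is a stand-in function, not a real UI):

```python
import random

def mutate(parent: dict) -> dict:
    """Make a child variant: same prompt, new noise seed, jittered settings."""
    child = dict(parent)
    child["seed"] = random.randrange(2**32)
    child["cfg"] = max(1.0, parent["cfg"] + random.uniform(-1.0, 1.0))
    return child

def evolve(start: dict, pick, generations: int = 5, brood: int = 4) -> dict:
    """Selection loop: generate a brood of variants, let `pick` choose
    the favorite, then breed the next generation from the winner."""
    current = start
    for _ in range(generations):
        current = pick([mutate(current) for _ in range(brood)])
    return current

# Stand-in "user" that always prefers the lowest guidance value.
winner = evolve({"prompt": "dragon in a castle", "seed": 0, "cfg": 7.5},
                pick=lambda brood: min(brood, key=lambda c: c["cfg"]))
print(winner["prompt"])  # dragon in a castle
```

In a real tool, `pick` would be the human clicking on the image they like best, exactly as in Midjourney's variation buttons.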
The inpainting and video transition features seem revolutionary to me.
Can we have something like this for cartoonists?
Upload a model sheet of the artist’s characters, then type in a script/screenplay, then have the a.i. take the characters from the model sheets and arrange them in a comic book layout or in an animated story, based on the script imported.
People would be able to make the stories they come up with, without a major studio. That’s a practical use of a.i.
There are two papers out already that seek to do basically exactly this, but I have not seen easy to use implementations for stable diffusion. The more generalizable option is an excellent work called textual inversion where you basically learn a "word" that describes the character you are trying to generate in different situations. Then there is another work called dreambooth which comes at it from the opposite angle and fine tunes the network to generate your character when a specific key phrase is put into the prompt. Both are very promising for this exact kind of situation, but neither are quite there yet for generating exactly what the artist wants.
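The textual-inversion idea (freeze the whole model, optimize only one new token embedding) fits in a few lines if you shrink the "model" to an identity map and the loss to squared error. This is a drastic simplification purely to show the shape of the technique, not how the real method trains:

```python
import numpy as np

def learn_token(target: np.ndarray, lr: float = 0.1, steps: int = 300) -> np.ndarray:
    """Toy textual inversion: every model weight stays frozen; gradient
    descent updates only the embedding of one new token (think
    '<my-character>') until it reproduces the target concept's features."""
    rng = np.random.default_rng(0)
    embedding = rng.standard_normal(target.shape)  # random init for the new token
    for _ in range(steps):
        grad = 2.0 * (embedding - target)          # gradient of ||e - t||^2
        embedding = embedding - lr * grad
    return embedding

concept = np.array([0.3, -1.2, 0.7])  # stand-in for the character's features
token = learn_token(concept)
print(np.allclose(token, concept, atol=1e-6))  # True: the token converged
```

Dreambooth is the mirror image: instead of learning one embedding against a frozen network, it fine-tunes the network's weights to respond to a fixed key phrase.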
It is definitely coming. I think people have already made some comic books with this stuff, but it's hard to keep consistency right now, so that's a bit limiting. But there's already work on Textual Inversion, where you feed a series of images that get associated with a special text token you can use in other images. Like a pose or character or art style. That part is in its infancy still, but I doubt it'll be very long at all before you can easily lock in on certain things and use them in multiple images.
It sort of is happening. I speak in animation industry groups now and then, and I've heard stories of artists currently being fucked over by corporations taking their art without permission as training data for their AI and then generating the images they need in that artist's style instead of hiring them. On an individual level this is a great tool, but it sucks when artists can have their specific personal art styles snatched for AI with no compensation to the artist who supplied the basis for the AI's work. It would be interesting if artists got royalties when their specific art was being used or emulated by an AI, but it would be impossibly hard to regulate and even harder to enforce.
@@Aviivix Yeah, I agree regulation would be very hard. Since you can't copyright styles per se, there's nothing to stop an AI company from hiring someone to make some art in the style of the original artist and train the AI with that instead of the originals. Or a future AI could be very configurable in terms of style. Maybe a human or another AI analyzes some artwork to create a giant paragraph of style info. You paste that into the image AI and it starts creating similar artwork without having seen images by the original artist or even an imitator. Even in the current state of things, SD says future models will let artists opt out, but a person with a local copy of the model could always train it themselves on a particular artist, labeled with some other name, and it'd be hard to prove that they did it, since you can't really peer inside the finished model.
But it's kind of a moot point, since I don't think the current laws stop any of it from happening at the moment, as long as the end user doesn't claim the resulting image actually came from that original artist.
This is probably the worst use I can think of for AI, imagine how quickly companies would start using this to make shitty ads and animators would get fucked over even harder than they are already. It's a painstaking craft that already is severely underpaid and people still want to interpolate everything because "wahh expensive"
Art is a luxury; it's not meant to be affordable. If you can't be arsed learning hand-drawn, 2D puppet rigs are already very much a thing.
Just installed it on my home computer. Now I'm in image heaven with the amount of unique, unusual stuff this thing produces.
Amazing working group under the direction of Prof. Ommer; so glad I did my master's thesis in this group. Quite inspirational stuff!
I've been running inference on and researching Stable Diffusion for about a month, and I can firmly say that its diffusion model is equal to, if not greater than, Dall-E 2 at the moment. I'm anticipating the release of the version 1.5 checkpoint, which vastly improves output coherence.
From my experience with Stable, I'd say the opposite. Stable is very far from making something on an acceptable level, no matter the prompts, and it's especially visible when you use the same prompt on all 3 of these models. Almost always, Midjourney and Dall-e 2 will be on top. It's actually rare for Stable not to shove out junk results.
@@WwZa7 You’re clearly doing something wrong then, or you’re using the original repository without the latest diffusion models. User error, not the fault of the model.
@@Ardeact I'm using 1.4 from original repository.
@@Ardeact I also tried the 1.4 version of the model and can confirm that the output is far from the quality shown in the examples… even after tweaking a lot of parameters…😢
@@iphoneextra The prompt handler for SD is different from Dall-E's, so prompts that usually look good for Dall-E won't suffice for SD. You need a prompt builder, and there are a couple online. The behavior of prompts is different too: some modifiers can be destructive rather than helpful to inference, while Dall-E can be more forgiving. Simply put, SD is more "advanced" in that, with the right tweaking of your prompts, you can get reliable, consistent results; that's the benefit of it being unforgiving.
Development is going rapidly. Initially I could create only 512x512 pixels on my 6 GB card; nowadays I can create 1088x1088 on the same card.
How?
What would it cost to create 10 megapixel work?
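A back-of-the-envelope answer to both questions: SD denoises a latent at 1/8 the image resolution, and naive self-attention over that latent costs memory quadratic in the number of latent pixels, which is a big part of why higher resolutions used to blow past 6 GB. The figures below are rough, assume fp16, and ignore the UNet's further downsampling and multiple attention heads:

```python
def rough_cost(width: int, height: int, bytes_per: int = 2):
    """Rough memory cost in MB: the latent tensor (4 channels at 1/8
    resolution) plus one naive self-attention matrix over all latent
    pixels. bytes_per=2 assumes fp16."""
    lw, lh = width // 8, height // 8
    tokens = lw * lh                            # one token per latent pixel
    latent_mb = tokens * 4 * bytes_per / 1e6
    attn_mb = tokens * tokens * bytes_per / 1e6 # quadratic in token count
    return latent_mb, attn_mb

for side in (512, 1088):
    latent, attn = rough_cost(side, side)
    print(f"{side}x{side}: latent ~{latent:.2f} MB, attention ~{attn:.0f} MB")
```

By this estimate, a 10-megapixel image would have on the order of 150k latent tokens, putting the naive attention matrix alone in the tens of gigabytes; memory-efficient attention implementations are what made the jump from 512x512 to 1088x1088 on the same card plausible.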
Love 2 minute papers. A fast, easy way to stay up to date. Thank you.
Rest In Peace, digital art
WHAT A TIME TO BE ALIVE! Gets me every time.
I hope open source AI dominates closed source AI
Won’t happen unfortunately because training these models requires huge volumes of data, and only a few companies have access to it (namely Google, Microsoft, Apple, and Facebook). The training data is just as important as the models’ source.
That "Robin Williams as Legolas" was perfect!
Would love to see something like this integrated into GIMP
Somebody already did in Krita.
@@USBEN. Someone is building a plugin for Gimp too, and another is doing Photoshop.
When I need a reference image I could just generate it and copy it into gimp or krita
@@users4007 I was thinking of a deeper integration, noise filters, enlargement, mosaics, object removal, it seems this new open source image manipulation AI is more than just generating images from text
I must have watched less than a minute of your video when I realized I had to install something if I was going to follow along with the tutorial. I poked around the website with the installation instructions for about 15 minutes, and after messing around for what must have been another 20, I got it all up and running. I must have spent the last 6 hours f****** with this thing; it's so much fun. After all of that, I finally finished watching your video, and I'm even more floored and excited to play with this further. But alas, I am exhausted and need to go to bed. Good night, and thank you.
What a wonderful advancement in AI! I'm finally able to dive into the mysterious world of neural networks and Transformers. I was working on neural networks and multi-agent systems a decade ago, but eventually gave up my passion for food and shelter in my country (where talented people can't get funding to support research unless they have ties with someone higher up). While I realized its potential, I never imagined how fast and how much AI could improve in such a short period of time. Now it is going to be applied to generating robust, freeform robotic movements.
After messing around with AI like Midjourney, there is boundless potential for this technology to be used as a tool alongside the human mind. With just a simple prompt, AI like this can generate an incredibly profound and detailed image in a matter of minutes. As someone who enjoys writing music and playing video games, I can only imagine how this technology will be used to create hyperrealistic maps for video games, or a template for a song to later be tweaked and expanded upon by a human. This is a revolutionary moment in history.
I don't understand how you see it that way. You aren't writing a song if the AI is doing most of the work.
@@azzy-551 Obviously I wouldn't want to be a fraud and use an AI to compose 100% of the song, but it can be used as a supplemental tool to the songwriting process. You could tell the AI to give you an 80's synth sound over a funky bassline and it could spit that out. Chances are you will want to tweak it and make it YOURS. There is also the possibility of using AI to be a fully-realized live drum machine that can listen and adapt to the music being played and play like a real drummer would. Specific drumming styles and even the vocabulary used by individual drummers could be called upon in live performances. The possibilities truly are endless with this technology.
I don't think that AI in the future will be used purely to spit out songs and call it done - we've seen similar sentiments about technology overtaking human art when recorded music was first invented, when cameras were introduced, when MIDI programming and synthesizers were the new thing in the music industry. These are all examples of things that rather than simply replacing their respective artforms as we know it, ended up being incredibly useful tools that expanded upon them.
@@JXter_ An AI can still write a whole song. It can make art in seconds that is indistinguishable from human art. There are 8 billion people on this planet, and a lot of them are lazy and don't have any musical skills. Do you think they're gonna use it as a tool to help them? They're probably just gonna whip up a song using some prompts, feel slightly accomplished, then do it again. Try being a small music artist when literal millions of AI-generated songs are produced each minute by some guy in his bedroom. It drowns out people who actually put effort in. I think AI is an incredible tool, but at the end of the day it will be abused.
I want to see a modification that lets you go through the different steps of the diffusion process and edit midway steps to allow for directing the process without going fully into using image-to-image.
Give it some time and photoshop will have an AI diffusion brush. That way the process is as manual or procedural as you want.
Thank you so much for sharing all these interesting papers! And for your extra work in linking stuff so everyone can take a closer look or test it 😀
I already saw the first discussions online among artists on how the credit for such images should be distributed. I think this deserves very critical consideration, and it's just such a small start...
Photoshop lets you create images you couldn't draw by hand, using toolsets, filters, brushes designed by programmers and designers, etc. A camera captures real life people and places. Yet, in both these cases, you still get credit for your creations. I believe this will also be the case with AI generated images in the future. But true artists will not be satisfied with the outputs unless they get in and edit things themselves, combining multiple iterations, collaging them, tinkering with the results in photoshop, etc.
For now, the bigger question isn't how to forge a more badass drawing model, but how to make a framework that can hook different drawing models up to some sort of user-friendly interface.
There are a lot of instruments already, but pretty few of them are polished for practical applications.
I am amazed at the pace of this research. It wasn't long ago that GANs could only create images that vaguely resembled real images, but on close inspection were really creating nothing. Now they can full-on create ART, with specified styles and specifics. But I am also now quite sad. I wonder how this will impact human art pursuits; will there be a place for artists? Next on the list is music generation. I would have been skeptical that it would ever be competent, but now I'm sure that within the next couple of years we'll see full-on AI music generation. As a musician myself, again, I am unsure how to feel.
You should be scared because these AI and tech nerds want to replace fucking everything that makes us human with AI because they never got laid in high school.
Text based generative story games + images made based on text… I’m excited to see the implications these emerging technologies have on video games. Traditional games always suffer from having to abide by the rules of branching factors, but a game generated by an AI should suffer no such ailment as it can make something new on the fly!
This already exists. It's called AI Roguelite. It's early in development, but you can integrate Stable Diffusion for images and a Novel AI subscription for high quality text generation. The result is wild and chaotic, but fascinating and highly entertaining!
As a game design student, I'm so excited to use this for concept art!
@michael bold of you to assume he doesn't know anything, instead of just wanting to get concept art done faster?
@@safeforwork8546 Exactly right, concept art is about speed more than anything else. Especially for environments where the final look can easily completely change.
This is beyond exciting!!! The future definitely looks very colourful
The future looks bleak for thousands of people that this might replace. You people are delusional.
This is going to be so helpful for me in the future as I am bringing my blog into the video world in early 2023 😊🙌
I’ve been running it for a couple of weeks on my 3060 Ti. If you have one of the optimized repos and eight gigs of VRAM, you can generate images in under 10 seconds each. Aside from the obvious benefit of being able to generate unlimited images for free on your own hardware, there is the popular bonus of creating unlimited fantasy boobies. Don’t let anyone fool you… nerdy pervs want to run it on their own hardware mainly to get around the cost and the content filters. :)
You definitely have to spend some time and effort on your prompt if you want a good result. Longer and more detailed descriptions work best. You’ll also get better results if you name specific artists whose work you want the AI to stea… er… I mean be inspired by.
I'm sure that if someone can afford the hardware to run this thing, they can definitely pay for a subscription. So maybe the content filter skipping is the real reason people have to download it and install it locally XD
@@ronilevarez901
In the long run it's better to have your own hardware (both for this AI and for all the others to come), especially if you're going to use it a lot. On top of the cost savings, you also have the advantage of privacy.
I have no words. Some of those pictures are incredible.
Can you please do a video covering all of the relevant AIs? Dall-E 2, Dall-E Mini, Midjourney, Nvidia Canvas, and others?
It's overwhelming how many AIs there are; we can't keep track of them all.
Seems like you’ve kept track of all of the main AI art generators just fine. You only missed the one he mentioned in this video: Stable Diffusion. And Dall-e Mini changed its name to Craiyon to avoid confusion.
I've seen a lot of great videos on here comparing them all; this channel is more for an extremely broad and concise overview to pique curiosity. 🙂
This is insane… my mind is almost as blown as the first time I saw your video on snow modelling x)
I realize this tech is part of the unstoppable march of progress, but it feels ethically questionable to scrape artists' and photographers' content en masse and repackage it via AI with zero permission. This tech owes everything to the goodwill of creators on the internet.
Well, yes... But one could essentially say that all artists start off their training by binging all of the images they can find. And the great artists use reference images and photos of existing images when making their own art.
@@jmalmsten Indeed, but IP law is not based on philosophical positions. You can still sue if your art is used without permission, but it's up to the artists to do so. I assume the AI was trained on public images.
That's not what's happening. It's not repackaging anything, and most of the time it's not reproducing anything. This is more like someone being inspired by art they've seen, not copying it.
Dear 8 billion people, may I please have Your permission to see a trillion data points publicly available online? Thank You in advance.
Thing is, this is how artists draw: they take what they see and mix and match the parts that look good. If we say what the AI is doing is wrong, then artists will need to stop using references, since that is exactly what the AI is doing.
A simple slider could be programmed where sliding it left or right changes the image; pressing spacebar means you like the direction the AI is going, while pressing a different button 'subtracts' from that direction, so you can slowly mold the image you're looking for.
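One common way to get that "slider between two results" behavior is spherical interpolation (slerp) between two latent noise vectors: moving the slider changes the interpolation fraction, and the decoded image drifts smoothly from one result toward the other. This is a standalone numpy sketch of the interpolation itself, not wired to any particular Stable Diffusion codebase; the function and variable names are illustrative.

```python
import numpy as np

# Sketch of the "slider" idea: spherically interpolate between two latent
# noise vectors. t=0 gives the first latent, t=1 the second, and values in
# between blend smoothly along the hypersphere.

def slerp(t, v0, v1, eps=1e-8):
    """Spherical interpolation between vectors v0 and v1 at fraction t."""
    v0n = v0 / (np.linalg.norm(v0) + eps)
    v1n = v1 / (np.linalg.norm(v1) + eps)
    dot = np.clip(np.dot(v0n, v1n), -1.0, 1.0)
    theta = np.arccos(dot)
    if theta < eps:                      # nearly parallel: fall back to lerp
        return (1 - t) * v0 + t * v1
    s0 = np.sin((1 - t) * theta) / np.sin(theta)
    s1 = np.sin(t * theta) / np.sin(theta)
    return s0 * v0 + s1 * v1

rng = np.random.default_rng(0)
a, b = rng.standard_normal(16), rng.standard_normal(16)
mid = slerp(0.5, a, b)                   # a latent "halfway" between the seeds
```

In a real UI, the slider position would set `t`, and the chosen latent would be fed to the diffusion model's sampler to render the blended image.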
Great video as always! Stable Diffusion (and even Dall-e 2) are bad with faces... is there any way that the model could be combined with StyleGan (of thispersondoesnotexist fame) to improve this? What do you think?
there's plugin support for GFPGAN and RealESRGAN face correction on "Stable Diffusion web UI"!
Thank you so much for your generous support! 🙏 I think more parameters in the next iteration of DALL-E (and hopefully Stable Diffusion) will show meaningful improvements on that.
That's just amazing! I'm blown away.
I feel like a lot of the Stable Diffusion art I've seen looks a lot cleaner, without artifacts, compared to DALL-E 2.
I tried both, Dall-E 2 is still better than Stable Diffusion.
@@cube2fox I think Dall-E 2 is better for photos and realism for sure, but Stable Diffusion is waaaaaaaayyyyyyyy better for the kind of art that would cost you thousands of dollars and many hours of work to get.
@@cube2fox I thought Dalle wasn't available to the public
It's available to specific people. Maybe they're on the list like I am
@@AGILISFPV You can register for Dall-E 2 and be put on a waitlist. Once they approve you, you get some free initial prompts plus a few every month. Additional prompts can be bought via credit card.
WHAT THE ****. DALL-E 2 came out like just yesterday, and now we're doing animation. I'm obsessing over this stuff.
I love these videos! Much love and I'll keep holding onto my papers!
Creators of illustrated books, comics, and Dungeons & Dragons adventure modules are gonna have a really good time with software like that.
Nah they're gonna get fired because of it
@@livvy94 there’s always good use for dungeon masters
Can't wait until some genius comes up with feeding Stable Diffusion into Blender's procedural image texture system.
Generating realistic, seamless PBR maps with just some prompts would be a dream coming true!
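Part of building a PBR set from a generated texture is deriving the auxiliary maps. Here is a hedged sketch of one such step: converting a grayscale height map into a tangent-space normal map ready for Blender. The `strength` parameter and the RGB encoding convention are assumptions, and a generated image would stand in for the zero array used here.

```python
import numpy as np

# Sketch: derive a tangent-space normal map from a 2-D height map (0..1),
# one step toward turning a generated texture into a PBR material.

def height_to_normal(height, strength=1.0):
    """Convert a height map to an RGB normal map encoded in [0, 1]."""
    dy, dx = np.gradient(height.astype(np.float64))   # surface slopes
    nx, ny = -dx * strength, -dy * strength
    nz = np.ones_like(height, dtype=np.float64)
    length = np.sqrt(nx**2 + ny**2 + nz**2)
    normal = np.stack([nx, ny, nz], axis=-1) / length[..., None]
    return normal * 0.5 + 0.5            # remap [-1, 1] -> [0, 1] for RGB

flat = height_to_normal(np.zeros((4, 4)))  # flat surface -> RGB (0.5, 0.5, 1.0)
```

A flat height map encodes to the familiar uniform blue-purple of normal maps, which is an easy sanity check before plugging the result into a Normal Map node.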
Imagine using Stable Diffusion to create an on-the-fly UV-mapped texture for a 3D model in VR space! Voice to text, text to image, image to base mesh! The possibilities are endless!
I love where AI art is going; not so much with my species. DA, for instance, now has an ever-growing number of people claiming they are artists when all they are showing is AI-produced art. To me that is theft; just because you can type some words does NOT make you an artist. If I asked an artist to make an image from the same prompt I gave an AI, that would not make me an artist. Why can't people just be honest? Nice to see, yet again, that we can't be trusted.
I mean, technically they are; authors also only type some words and are considered artists. The question is just whether we value the process or the result. If it's the latter, you can honestly call yourself an artist; as a side effect, the perceived value of artists will just diminish in a sort of artistic hyperinflation.
@@majorfallacy5926 They aren't, they wrote a prompt.
@@jonc8561 and they got art out of it, for free, without involving another person. Which is all I need for my dnd campaign. Most people, including me, give exactly 0 damns about all the other pretentiousness and just care about the result
The interpolated version at 3:18 gives an awesome result.
It's like a medieval fantasy town with amazing depth; it reminds me a little bit of Zaun in Arcane.
This is amazing and terrifying at the same time. I can't help but wonder if making it completely open source is the most ethical choice, even if keeping the source under wraps isn't very good either... I'm sure this is going to be a major debate in the coming years
Any skilled artist can already do everything it's doing in Photoshop or Blender. If it's unethical to have access to these tools, it's unethical to be a skilled artist in general. Ban the paintbrushes.
@@Squiffel "skilled artist" is key here. It took years of training and dedication. Now, any shitposter on Twitter, Reddit or 4chan can do it too. There will be plenty of malicious and destructive uses, most of which we can't even think of right now.
I'm 100% sure we'll see news stories about people dying because of what these models have enabled - for example, targeted campaigns of abuse leading to suicide or murder.
We live in interesting times, for sure.
@@itsbazyli I hear you, of course it will be abused, that doesn't make releasing AI to the public unethical. The internet allows 4chan to abuse people, but no one argues the internet should be kept behind closed doors only accessible to major corporations because otherwise "4chan" exists, but they make similar arguments with AI that can generate images and I think it's short-sighted.
What if Google creates General Artificial Intelligence and they argue it would be unethical to release to the public, only Google employees can use it, to protect us from 4chan? The idea that these tools are only ethical if a couple corporations have access to them, but unethical if the general public has access just really doesn't feel right to me.
@@Squiffel I suppose I meant that this will allow for people to somewhat effortlessly generate NSFW images of specific people by name without their consent, for example. While you could do that with Photoshop, this makes it take a few seconds rather than minutes or hours.
@@radshiba_ They will, yes. But if you read between the lines, the line between ethical and unethical is usually whether or not it's a major tech company that has a monopoly on tools that can make realistic porn and deep-fake politicians.
If the general public has the tools, it's unethical. If some random Google employee has the tools to mass-manipulate the public, it's ethical.
I think the framing of the debate is just wrong and self serving to corporations writing blogs about it. If the tool is inherently unethical, these corporations should be banned from having them and making them.
This guy's recording voice decimates me, man
Imagine the first AI-directed movie in theaters. We humans would create the script and input certain scenes into the AI generator, stitch them all together, and you would have a full movie. Very possible if the AI could remember a certain character and place them into the environments. The future is crazy.
I think that would be relatively easy to achieve
What you describe is a normal movie made using the modern tools of so-called 'AI'. Joel Haver, among many others, makes animations using sorts of AI in the process, but we are a long, long way off from AI actually directing or creating anything.
Nice thought... The only problem is that the studios are probably already investing in companies to create an AI trained on their massive box-office, test-screening, and even eye-tracking data. And once the AI is capable of interpreting the outcomes within the dramatic formulas, I don't see why the studios would bother working with a human on the script...
I'm very excited at the prospect of how this is going to be a game changer once the model learns to retain or remember a face/character/scenario and render it under different angles, compositions, lighting conditions and scenery. The applications are near limitless.
Wow! Will definitely look into it! But is this source code only, or does it come with a trained model/dataset so that you can use it right away? Does it require training or source images? If it needs a dataset or training... is it huge? (I've never really used AI!)
It comes with a trained model you can use right away (it's about 4 GB).
It is pre-trained. You can use it at home; there are a few guides for setting it up yourself from various Github repos (if you are familiar with Python), or you can use one of several working .exe installers (below):
* NMKD Stable Diffusion GUI
* GRisk GUI
Awesome thanks! :D
After discovering max(x,0) and naming it ReLU, AI engineers are wandering into domain decomposition and discovering parallelism! I'm teasing, but I'm baffled by how amazingly effective AI has become! You do an amazing job, thank you for keeping us updated!
I love this channel, it has some really interesting and well-made content, that's also easily digestible. I just wish they'd change the synthesizer because the prosody makes it a little hard to follow.
I could be wrong, but I think it's a human, isn't it? You can see him give a lecture. :) But thanks for the new word of the day: "prosody".
I'll talk to his parents but he is a grown man. He may be stuck that way.
*bill wurtz jingle* "That's a human person"
@@steveaustin5344 holy shit, I'm mortified. 💀
Thanks for the correction. I'm going to dig a hole to die in now.