Love the videos, with some of the best content and resources out there! What I find interesting is that each application seems to vary in quality depending on the time of day. I cannot speak for every application, but I have noticed this with Midjourney and Leonardo. For example, I have created over 1,000 images in both over the course of hours using the same prompt (not regenerate) and noticed that, as time progressed, the images went from worse to really good. I decided to test this only after noticing that prompts that produced great images one day would render rubbish the next. It also seemed that the more coins and time I spent, the better the results were. I am not sure if anyone else has had the same experience, where the time of day, length of session and so forth tend to result in higher-quality images. But great video, as I love the content you put out each week! Keep them coming!
The new Leonardo Diffusion XL just came out yesterday (probably after you recorded this video). It is actually based on the SDXL 0.9 model, unlike the Dreamshaper V7 you used, which is based on SD 1.5. It would have been great to see how an actual SDXL-based model would have fared against the other image generators. Hoping to see a review of the new Leonardo SDXL model soon from your channel. Keep up the good work. We appreciate what you do!
I thought a similar thing. SD and SDXL have so many models it's not a true representation of what it does. Especially Dreamshaper. It's really good, but is an old model. (I don't have anything against older models. RevAnimated is one of my favs) There are some crazy good ones out there, and they all have their own style. I prefer automatic1111 but Leo is good. Playground ai actually has more SDXL models than Leo
The reason I love GPT DALL-E 3 is that it can create images that are difficult to describe. For example, if you get it to give you an entire lab report and then just ask it to generate a realistic image of the lab setup, it can do it.
I think this is very underrated in this review. So far it seems DALL-E 3's ability to handle complex images is far superior to the others, which seem to struggle with multiple subjects.
27:25 With a prompt like [Create a seamlessly tileable texture of a circuit, ensuring that there are no visible edges or seams when repeated.], DALL-E 3 does it.
🎯 Key Takeaways for quick navigation:
00:00 📸 AI Image Generators Overview
- Introduction to various AI image generators and the need to choose the best one for specific use cases.
01:08 🔍 Accuracy Assessment
- Testing accuracy by providing prompts and evaluating how well each AI image generator adheres to them.
04:09 🎨 Creativity Evaluation
- Assessing the creativity of AI image generators by giving them minimal context and examining the resulting images.
08:37 🌟 Realism Analysis
- Analyzing the realism of AI-generated images using a specific prompt and evaluating how convincingly they depict the scenario.
18:50 🎨 AI Image Realism Test
- Midjourney raw, Firefly 2, and Midjourney without raw scored the highest in realism.
- Google and Ideogram performed poorly in creating realistic images.
19:59 🖌️ AI Image Illustrations Test
- Midjourney in niji mode excelled in creating colorful and contrasty illustrations.
- Style raw mode wasn't suitable for illustrations.
- DALL-E 3, Bing Image Creator, Leonardo, and Firefly 2 produced decent illustrations, but lacked the contrast of Midjourney.
26:13 🏞️ AI Image Tiling Test
- Midjourney, both regular and raw, successfully created tileable images.
- DALL-E 3, Bing Image Creator, Leonardo, Firefly 2, Google, and Ideogram struggled to create seamless tileable images.
29:42 📝 AI Text in Image Test
- Midjourney and Midjourney raw failed to generate accurate text in images.
- DALL-E 3 produced some images with text, but with typos.
- Bing Image Creator also generated text, but with typos.
- Leonardo and Firefly 2 didn't perform well in generating text.
- Google and Ideogram successfully added text to images, but Google struggled with multiple words.
33:53 🚫 AI Censorship Test
- Midjourney and Midjourney raw had minimal censorship, allowing celebrity and IP-related content.
- DALL-E 3 restricted some content based on policies.
- Bing Image Creator had increased censorship, blocking certain prompts.
- Leonardo showed no censorship and allowed various content.
- Firefly 2 had significant censorship, blocking both Tom Hanks and SpongeBob with Super Mario.
- Google had censorship issues, blocking Tom Hanks and Super Mario-related prompts.
- Ideogram had some censorship but allowed certain content.
37:07 🖼️ Image Generation and Censorship
- DALL-E, Firefly, and Leonardo can generate various images effectively.
- Ideogram appears uncensored but lacks in image quality.
- Midjourney and Google have usability and censorship issues.
38:32 🤖 Usability Evaluation
- Midjourney's Discord interface can be overwhelming.
- DALL-E 3 in ChatGPT offers a familiar chat interface.
- Leonardo provides versatility with a user-friendly interface.
42:17 💰 Pricing Comparison
- Midjourney offers a $10 monthly plan with limitations.
- DALL-E 3 in ChatGPT is the most expensive at $20 per month.
- Leonardo has a free plan and a $10 per month plan, providing good value.
- Firefly offers a free plan and a paid plan at $5 per month.
- Google's image generator is free to use.
- Ideogram is currently entirely free.
45:45 🏆 Conclusion and Recommendations
- Leonardo is recommended as the best value with a score of 75.5.
- Midjourney and Ideogram offer different strengths and drawbacks.
- DALL-E 3 in ChatGPT is less recommended due to high cost and censorship.
- DALL-E 3 in Bing's Image Creator is an accurate, free alternative.
Made with HARPA AI
Yet again, outstanding content from an outstanding author. I really appreciate the amount of work you must have put into this article, and the net result was a go-to chart I know I will want to return to, over and over. In fact, I will probably make my own version of your chart in a spreadsheet and add the important column I think was missing -- Weight. Each of the criteria can assume a relative importance depending on the type of image normally produced by the creator, the look and feel of the image desired, the amount of money available for the project, etc. So I am going to add a column called "Weight" which will multiply the results in the columns by a weight of 1-10 -- and that will make your chart absolutely perfect!
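The "Weight" column described above can be sketched in a few lines of Python instead of a spreadsheet. This is only an illustration: the category names, scores, and weights below are made up for the example and are not taken from the video's chart, and the function returns a weighted average (so the result stays on the same 0-10 scale) rather than a raw weighted sum.

```python
def weighted_total(scores, weights):
    """Return the weighted average of per-category scores.

    scores:  category name -> score on a 0-10 scale
    weights: category name -> importance from 1 to 10 (hypothetical values)
    """
    total_weight = sum(weights[category] for category in scores)
    if total_weight == 0:
        raise ValueError("at least one weight must be non-zero")
    weighted_sum = sum(scores[category] * weights[category] for category in scores)
    return weighted_sum / total_weight


# Hypothetical numbers, for illustration only.
example_scores = {"accuracy": 8, "realism": 6, "tiling": 0}
example_weights = {"accuracy": 10, "realism": 5, "tiling": 1}

print(weighted_total(example_scores, example_weights))
```

With these made-up numbers, a generator that fails the tiling test loses very little, because tiling carries a weight of 1 while accuracy carries a weight of 10.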
DALL-E 3 is great unless you like horror images, any artist from the last 100 years, or anything even marginally risqué that involves women or celebrities. It's way too censored. I hope they dial it back a bit, otherwise people will bail really fast when the new Midjourney model comes out with better accuracy and text.
Minor secret: avoid using direct gendered words like he, she, her, him, male or female. Use less direct terminology. I believe it'll get you better DALL-E results. It turns it into an image dice roll.
Absolutely love your content, Matt. I love the advances in AI at the moment and know more than most people in my circle. When they ask if there's anywhere to keep in the loop, your channel is my go-to recommendation for ease of information. Please keep crushing it!
Thank God for this timely post, I had begun working on this very project this morning - you have better criteria and saved me a ton of work - may you be blessed for this good deed
Matt, this was truly excellent. Please make this a regular thing you do upon major releases from the leading platforms. As someone who has generated over 20,000 images in MJ and about 1,000 in DALL-E 3, I can say you nailed it. I really enjoyed seeing how the models I don't regularly use stacked up. Please do this again with MJ6 in December.
May I ask what you are planning to do with all these generated images? And do you keep everything you create or do you delete the imperfect ones as you go?
Hey. They sound like the figures I've generated. I found a great one: Freepik. It's just really good at loads of stuff. You get millions of stock photos/videos, Sketch to generate, Reimagine (just churns out unlimited versions), Enhance, and a creative upscaler. A hidden gem.
Great job Matt! We appreciate all the work that went into this video. My takeaway was a little different from yours (tiling images throws off sums as it’s a very specific use case, and - for me - accuracy would be much more heavily weighted). But looking at the same information and coming to a different conclusion is the mark of truly being informed!
Hey Matt, thanks for keeping us up to date :-) What you did not test and what's proven to be important for me is the ability to train your own models. Leonardo does that.
If Midjourney can create their own LLM and integrate it with its Image creator, Dalle3 won't stand a chance. Censorship will kill Dalle3, but I don't think OpenAI cares. Dalle3 is free (Bing version), so there's no incentive to change anything about it because the ChatGPT audience and the AI Art audience are different.
Yes, thank you for saying that. I haven't been able to generate usable images with DALL-E between its censorship and weird color scheme. The images it makes are too tiny, and they look fugly when enhanced, IMHO.
My take is that rather than waiting for DALL-E to become less censored, you should expect MJ to become more censored. They ban new words on a daily basis, and when the web version launches they might set new moderation rules. It's not about the LLM. DALL-E understands prompts better because the dataset was properly captioned. There is a white paper explaining it on the OpenAI site. MJ would need to retrain their models from the ground up with the same level of image captioning to get as good with prompts as DALL-E is.
Great video! I would have been interested in which ones are the best quality in terms of the size/resolution of images you can create. So many of these output very small/low resolution images that either require upscaling or are unusable for anything other than small web graphics. But, maybe that can be in the next one. Great work Matt! 😊
I've sometimes used Midjourney images in videos without upscaling when I was pressed for time. Widescreen 16:9 images are around a thousand pixels wide, whereas DALL-E will give you 275 px square useless crap. But it's still better to find a good upscaler. OH -- Midjourney now has a 200% and a 400% upscaler BUILT IN, but you have to spend fast hours to use it.
Great job as usual, Matt. The only thing that skewed it a bit is that you were using SDXL 0.9 in Leonardo instead of SDXL 1.0 from over at Clipdrop. The quality and accuracy jump from 0.9 to 1.0 was pretty major in my experience. Recently, when illustrating a book for a client, 0.9 was pretty much unusable, whereas since 1.0 came out, it's become our go-to.
Nice job! It's really interesting to see a direct comparison between the different models. One thing I may mention, though: in the video it looks like your image generation in Leonardo is set to Dreamshaper v7, which is an SD 1.5 model. SDXL should be able to do a slightly better job at text.
Matt, what a fantastic overview of all these products. Deciding which one to use for what can be overwhelming. Thanks for the clarity. Love the channel!!!
I appreciate this video, Matt. I'm already rewatching it for the second time. I use many illustrations of people doing their occupations for my work, and use Midjourney. Typically I am looking for a realistic black and white high-quality drawing. Others I do are historical color photos of Civil War soldiers and that type of thing. I have to take almost every illustration into Photoshop and fix the wonky faces, hands, etc. I appreciate the time Mj saves me over doing this work from scratch, but we still have a ways to go. Like you always say, it can only get better. I know you usually say "this is the worst it will ever be." 🙂 The one additional thing I would have liked to see in your video is prompting with an image. Often I have to use a reference image to get into the ballpark of what I want.
As a data nerd, this checked ALL the boxes. Near the end I started adding all of them up, assuming you wouldn't, and then you did and I just felt so complete. Like that final "ahhhh", perfect! Excellent video and well worth the effort. I would absolutely LOVE to see a similar comparison for other categories: text-based, math, programming, etc. PLEASE, MATT
Stable diffusion by a long shot. It takes extra work, sometimes lots of extra work, but it's the only one that's capable of producing exactly what you are imagining consistently.
Wow Matt, the work you put into these videos is incredible. Thank you for all the effort and brilliant content. This video particularly was so helpful to me (and others I'm sure). I managed to get a print-out of your final comparison chart. It's so helpful to have it all in one place like that. Perfect. Thanks again
I think the scores lack weighting. For example, tiling is not used much. Things that are used more should carry more weight. For DALL-E 3, the price is effectively free because you are already paying for GPT-4 and it comes with no additional cost.
Wow... wish you would do that with other tool categories as well! Good job. I admit, in the beginning I didn't think that a comparison would be interesting... but it makes it easier to "digest" the learning from your videos (which usually have been a little "hectic" and nervous because of the quantity of information in AI). Thanks for your work!
With DALL-E (2024/2) it is important to start a new chat any time you begin a new image concept, otherwise the context window will come to bear on the image produced. Also, DALL-E uses an LLM expert to turn the raw prompt into the actual prompt that is sent to the model for inference/diffusion. If you call the OpenAI API via a script, the JSON response includes the actual prompt sent for inference. See Hugging Face Spaces for various "prompt generators" which provide similar functionality for the various AI systems, including Midjourney and Stable Diffusion.
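As a rough sketch of what reading that rewritten prompt could look like in a script, the snippet below parses a hard-coded sample instead of calling any API. The field name `revised_prompt` and the `data` list follow OpenAI's Images API response for DALL-E 3 as I understand it; treat the exact shape as an assumption and check the current documentation. The sample text and URL are invented.

```python
import json

# Hypothetical sample shaped like an image-generation response.
# The "revised_prompt" field name is an assumption based on OpenAI's
# Images API for DALL-E 3; the values here are made up.
SAMPLE_RESPONSE = """
{
  "created": 1700000000,
  "data": [
    {
      "revised_prompt": "A photorealistic emperor penguin standing on sea ice at dusk.",
      "url": "https://example.com/image.png"
    }
  ]
}
"""


def revised_prompts(response_text):
    """Return the rewritten prompt for each image, or None where it is absent."""
    payload = json.loads(response_text)
    return [item.get("revised_prompt") for item in payload.get("data", [])]


for prompt in revised_prompts(SAMPLE_RESPONSE):
    print(prompt)
```

Comparing the printed prompt with what you originally typed is a quick way to see how much the LLM step changed your wording.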
Given a highly descriptive prompt (that you can have GPT4 generate for you) you can produce some really great results with DALL-E 3 I found, which well surpassed SDXL. It really doesn't belong at the bottom of the pile, especially if you stick to what it is good at.
It would be too much to expect Matt to refine each prompt to get the best results for each system, but I agree that with GPT 4 the process is not about your first prompt but how you get GPT to improve the prompt over subsequent generations.
@@rodrigoorozco9263 If you've got a beefy enough machine you can set up stable diffusion on your own private server, and it can be completely uncensored or filtered. I haven't tried but there are a few videos explaining how to do it and it doesn't look too difficult.
Nice comparison, but just a note that a lot of the negatives you gave for SDXL can be fixed by using specific LoRAs, ADetailer, different refined models, etc.
@@eg9xyz Stable diffusion is the best one in my opinion, you can control every piece of it, using other images as references, I use Dalle-3 images from Bing and then edit them on Stable Diffusion. Stable Diffusion has a line art mode, I drew a bunch of shapes on a piece of paper and uploaded them, wrote a prompt, and it made an incredible image from the scribbles and shapes I made!
SOLUTION: Matt, I love your channel. I find it very informative. I got the Text inside images with Google to work. Enclose the phrase inside curly brackets!
Excellent video, as always, thank you. Just a note on color spaces: RGB is the color space used for digital displays, similar to how CMYK is used for printing. Given that all non-vector images generated by AI have so far been inherently in RGB mode, the term "RGB image" in a prompt may not convey specific color requirements or color choices to an AI image generator, and this should not be seen as a flaw in the output.
Maybe copyright issues with the Eiffel Tower. While many countries have a “freedom of panorama” law that allows for the photographing of skylines and copyrighted buildings, the European Union actually allows countries to opt out of these laws, which results in some of the most famous European landmarks being illegal to photograph. Photographing or videotaping the Eiffel Tower at night, in particular, is barred. The copyright restrictions on the Eiffel Tower are different for nighttime views because lights were not installed on the tower until 1985, which means that the Eiffel Tower with its lights on is still under copyright protection today.
Leonardo AI added a new refiner called Alchemy Refiner, which goes back to correct any malformed limbs and hands. And I love its Canvas feature, which allows you to go back and spot-fix defects, or add images to images and link them together.
Wow, that's a great comparison. Very interesting! Thank you for making this video. As a daily Leonardo ai user I have to mention that it can do much more. The model DreamShaper v7 was used in the video but there are many more amazing models to use on Leonardo ai. The most powerful ones right now are: Leonardo Diffusion XL, Leonardo Vision XL, AlbedoBase XL and PhotoReal. They create amazing results.
Surprised to see Stable Diffusion leading the pack, especially when many seem to lean towards it for its lack of content restrictions. I've always been a fan of it due to its cost-effectiveness with a good GPU. However, with the community's diverse opinions on various AI tools, it's clear that there's more to these tools than meets the eye. The discussions around censorship, usability, and tool capabilities are particularly intriguing. It's a dynamic space, and I'm eager to see how these platforms evolve in response to user feedback and technological advancements. Cheers to the ever-evolving AI-ionosphere! 🚀
Thank you for always making these incredibly informative videos. I always look to you for where to put my time and energy and you’ve made this crazy AI journey feel less overwhelming with each video. 🙏🏼☯️🌞
This table was great. I personally give a lot more weight to accuracy than anything else, as I don't really care how beautiful and detailed the image is if it is not what I am asking for... Therefore, I don't agree with the overall scores, but it is great to have all of them organised by their strengths and weaknesses. Awesome job!
In conclusion, Leonardo has the most nines and tens of them all. You forgot to mention "Blue Willow" and "Imagine AI"; these are great AI image generators as well.
Hi Matt, this is one of your great videos. I highly admire and appreciate your effort. I would ask you to make two more parts of this, because different platforms give different results depending on how they handle prompts, like keywords or negative prompts. You could also share comparisons of other tools, like bot avatars or whatever you like best. Thanks
Excellent work! And thank you for all of your time you put into it. For vector I would def suggest to use another model than dreamshaper 7.0. But glad to see it still outscored all others. I use both MidJourney and Leonardo and wanted to see this video to see the difference with everything else available
I mean, with SD the same seed and prompt will always give similar results. That is actually one reason it is great for making characters in a similar scene, among other uses. Also, you need to add more to your prompt for realism, like “photo”, in most image generators, or some will randomly generate more illustrative images.
I commented on another video looking for advice that I’m probably more likely to find in this one but… I’m using the GLAM app for iPhone and it’s inspiring me to look into more “pro-level” options.
Amazing work. Thx a lot. For next time, you could maybe consider adding a "Speed" category to compare the time needed to generate a lot of images, since for us AI creators, a tool that generates 1 image every 30 sec is pretty useless.
I missed it, but what prompt did you use to create the penguin at the end? I swear I have seen that penguin before, which makes me wonder whether there are two images that are alike, how intellectual property laws apply, and how such disputes would be handled. I was under a general and perhaps incorrect assumption that AI created “one of a kind” works, but I know I have seen that same penguin before and believe I could create something almost identical. It’s a very interesting subject in terms of how far things have come in just a short time. I know the models of months ago are not even on the same level as those of today! It's going to be fun to see where things go as models get more advanced!
12:03 for creative you picked the wrong word. Beauty actually has 2 common definitions (one in broad sense like you are thinking) but the 2nd definition in most dictionaries has "beauty" as a noun meaning "a beautiful woman" as in "she was a real beauty." So in this case you are gonna get consistent images of women. Prompt choice is so important, I think a better creative prompt would have been using 3 words that help describe creativity: "(imaginative:0.3), (artistic:0.3), (inspirational:0.3)"
Thank you for all your hard work. As a MidJourney subscriber, I've been wondering if I should stick with it or look at something else and this has been really helpful. Accuracy in MidJourney has certainly been a bugbear with it sometimes taking six or seven prompts to get what you want, and you addressed that perfectly.
Great but a lot of these are just style variations. A lot of use cases use an existing image as a start point so that would be good to cover, with the input image either providing the style or the size and shape. ControlNet support would be essential to getting a good score for this. Another use case involving an input image is inpainting/outpainting, along with a prompt to extend or replace part of it. A lot of generators can't even accept an input image so would get 0. How well the generator holds onto the essential qualities of input image that you're trying to capture while generating the aspects that go with it would define the score. Another that tests the abilities in mixed mode would be good too, as in it has understanding of real world data. A good test for this is to ask for the path of a river as a line on a map. For example "Draw the path of the River Thames as a thin red line on a map of Britain". Every image generator I've tried fails at this. Even prefixing something like "using accurate ordnance survey geographic data" still results in random drawings of rivers. Even when the outline of the Thames through London is well known and in plenty of training data images asking for that returns stylized imagery associated with the Thames instead of the actual shape.
Once Musavir is out of Beta you will be amazed!! I was asked to beta test it and it is mind blowing 4K output and humans so real. Btw you should use -s 500 with style raw in Midjourney
Great content. I probably would score differently, but you did say it was subjective. One category missing, which to me is huge, is licensing. I love Bing's output but cannot use it commercially. Probably most people watching this are using AI for POD, etc. Thanks for the great content.
The only one that I supremely disagreed with on the ratings was the price for Leonardo. Other ones that were only free were given a 10 and Leonardo does have a free version but also a paid version and it was given 7.5. Riddle me that one.
Thanks so much for the warning about licensing. I was going to mess around with Bing but not if that means I can never use the image for anything commercial.
This was a great breakdown. I literally just started learning about image AI and I really needed to see some comparisons so I knew which one to try. Thanks!
I’m curious how Blue Willow stacks up. I found it did a better job than Midjourney when it came to feeding it an image and asking it to make something based on that image!
Weird pricing grading. DALL-E, which slows down in fully free mode, gets a 10, but since it gets slower it should get an 8 or 9. Stable Diffusion, which is absolutely free in auto1111 or on some other platforms, gets 7.5 because of the platform he chose. Weiiiiiiird. Same with logos. Leonardo did great work on them yet received the lowest grade, even though you can also download a logo-specific model to win the prize.
Thanks for the video. I think it was good work on your side. I like the Bing Chat image creator myself at the moment. I would also like to see which offline AI or offline creators are the best at the moment. I'm thinking more about privacy, for cases where information used with an AI is not supposed to go online.
So I tried to create a Ford GT40 without wheels, but with a smooth, featureless hull in the style of a flying car. I tried doing it using Midjourney and failed, but then tried using ChatGPT and DALL-E. While it was super painful, I realised that ChatGPT was blind to the results and was just doing its best to communicate my objectives to DALL-E without being able to see the output. Once I realised this, I began to work with ChatGPT as a collaborator, both of us (I know!) trying to troubleshoot communicating with DALL-E, which involved me having to describe the output to ChatGPT so we could figure out the next steps. While WE mostly failed, it was a crazy paradigm-shifting experience for me, where I was collaborating with an AI (with whom I had a good rapport) to try and communicate with another AI that was harder to communicate with. I'm sure I'm projecting like mad, but I felt like ChatGPT was getting as frustrated as me! Happy to share the chat with you, @mattwolfe, and to have you share it with anyone.
Super video! I am using a lot of AI generators because of all these different possibilities. But most of the time I spend on local Stable Diffusion, not only because I love the open source movement very much: I like the idea that creative people all over the world are working on improving SD... BTW: with the new SDXL tiling you can also choose between horizontal or vertical tiling. Happy colored Greetinx
If you look at the actual prompt that GPT uses, it changes words, puts in synonyms and uses flowery language like "captivating the feeling of loneliness", which I think at best is wasted on DALL-E 3, as it knows shapes, color and textures, not emotions. You can tell GPT how it should prompt or even insist that it takes your prompt exactly word for word. I highly recommend Glibatree's YouTube video on custom instructions for DALL-E 3, and you'll get much better images in no time.
@@universe6735 Yes, I found DALL-E 3 quite underwhelming to start with, but when you understand how it creates the prompts from what you're asking and then you guide to make better prompts, the results are on an entirely different level
@@vodkaman1970 I feel like I've tried everything. I've even asked GPT to behave like Bing.😁😁😁 Of course, I'm aware of the descriptive language that GPT adds to original prompts. However, when I copy those same augmented prompts from GPT and use them in Bing, the faces in Bing are still noticeably more realistic. When I use the approach of instructing GPT to activate my prompts exactly as I have written them, Bing continues to yield less impressive results than GPT for photorealism of human skin.
That was a great video. I found the information to be exactly what I have been looking for. I would like to see how capable they are on different operating systems: iOS, Mac, Windows, Linux. Overall a great video, thanks for all the hard work.
My best free setup is: Leonardo for almost anything, DALL-E for text and specific prompts, and DALL-E inside Bing Chat is better than Image Creator! For realistic images, I miss the paid versions because they produce better results! Even Leonardo starts restricting some good features at some point. All my recent thumbnails have something from DALL-E or Leonardo.
Hey Matt, one mistake you made: you gave DALL-E a 3 and a 6 for something you gave Google a 7.5. Both censored Tom Hanks but created images of the Stormtrooper, Mario and SpongeBob. You should've rated them equally.
I appreciate the effort you went through to give an overview on this. I disagree with some things (There are other SDXL alternatives to Leonardo that are free and work great), but the overview of what the different services are capable is pretty useful. Kudos!
Matt -- this has been soo helpful. Ping-ponging back & forth btw the various image generators can be frustrating, but having a road map like the one you just created can hopefully take a bit of the struggle out of it.
Discord is discontinuing this feature soon. They'll be expiring images after a while. All of the others store your image in the cloud, at least for some time.
I look forward to having people review Novelai when they roll out their image generator update. They are making an image generator from the ground up - right now it's a fork of SD that's specialized for anime and takes prompts differently because it's trained off of danbooru - it has a free tier, credits you can buy, and multiple paid tiers, and it's fully uncensored. If the bot understands what you want it to generate it will do it.
Great video! I would vary my scores from yours ever so slightly, but overall I thought this was an awesome rubric review of all these tools. I enjoyed watching it, thank you.
Fantastic comparison video! I laughed a lot seeing the images and learned a lot about the tools and AI image generators in general. I'll give it a 10 out of 10 :D
The pass/fail metric for backgrounds/textures throws off the results. Weighting something as either 0 or 10 is a different methodological technique from the other types of metrics. Also, have you seen any of Dave Portnoy's One Bite Pizza Reviews? They're loaded with tips on how to squeeze the most interest out of these ratings-type videos, which in Dave's case are wildly popular and entertaining. His first bit of advice to you would likely be to resist using "rookie scores", meaning round numbers, i.e., a '7.0' vs. '7.3' - never use rookie scoring 🙄
What is the downside to essentially using a one out of 100 scale instead of one out of 10? That’s interesting to me because I prefer how much more granular you can get when you literally have 10 times the number of scoring options.
Thanks. You are amazing, Sir! Very decent summary. Lots of ground covered. Loved it
The reason I love gpt dall E 3 is because it can create images difficult to describe, like if you get it to give you an entire lab report and then just ask it to generate a realistic image of the lab setup, it can do it
Yeah image generators really need to be combined with other smart ai that can comprehend what hands or eyes are meant to look like
Too censored to be useful.
@The_Novu yep, major issue, also the decrease in accuracy and increase in errors
I think this is very underrated in this review. So far it seems the Dalle-3 ability to handle complex images is far superior to the others which seem to struggle with multiple subjects.
@The_Novu That's not true.
27:25 With a prompt like [Create a seamlessly tileable texture of a circuit, ensuring that there are no visible edges or seams when repeated.], DALL-E 3 does it.
🎯 Key Takeaways for quick navigation:
00:00 📸 AI Image Generators Overview
- Introduction to various AI image generators and the need to choose the best one for specific use cases.
01:08 🔍 Accuracy Assessment
- Testing accuracy by providing prompts and evaluating how well each AI image generator adheres to them.
04:09 🎨 Creativity Evaluation
- Assessing the creativity of AI image generators by giving them minimal context and examining the resulting images.
08:37 🌟 Realism Analysis
- Analyzing the realism of AI-generated images using a specific prompt and evaluating how convincingly they depict the scenario.
18:50 🎨 AI Image Realism Test
- Midjourney raw, Firefly 2, and Midjourney without raw scored the highest in realism.
- Google and Ideogram performed poorly in creating realistic images.
19:59 🖌️ AI Image Illustrations Test
- Midjourney in Niji mode excelled in creating colorful and contrasty illustrations.
- Style raw mode wasn't suitable for illustrations.
- DALL-E 3, Bing Image Creator, Leonardo, and Firefly 2 produced decent illustrations, but lacked the contrast of Midjourney.
26:13 🏞️ AI Image Tiling Test
- Midjourney, both regular and raw, successfully created tileable images.
- DALL-E 3, Bing Image Creator, Leonardo, Firefly 2, Google, and Ideogram struggled to create seamless tileable images.
29:42 📝 AI Text in Image Test
- Midjourney and Midjourney raw failed to generate accurate text in images.
- DALL-E 3 produced some images with text, but with typos.
- Bing Image Creator also generated text, but with typos.
- Leonardo and Firefly 2 didn't perform well in generating text.
- Google and Ideogram successfully added text to images, but Google struggled with multiple words.
33:53 🚫 AI Censorship Test
- Midjourney and Midjourney raw had minimal censorship, allowing celebrity and IP-related content.
- DALL-E 3 restricted some content based on policies.
- Bing Image Creator had increased censorship, blocking certain prompts.
- Leonardo showed no censorship and allowed various content.
- Firefly 2 had significant censorship, blocking both Tom Hanks and SpongeBob with Super Mario.
- Google had censorship issues, blocking Tom Hanks and Super Mario-related prompts.
- Ideogram had some censorship but allowed certain content.
37:07 🖼️ Image Generation and Censorship
- DALL-E, Firefly, and Leonardo can generate various images effectively.
- Ideogram appears uncensored but lacks in image quality.
- Midjourney and Google have usability and censorship issues.
38:32 🤖 Usability Evaluation
- Midjourney's Discord interface can be overwhelming.
- DALL-E 3 in ChatGPT offers a familiar chat interface.
- Leonardo provides versatility with a user-friendly interface.
42:17 💰 Pricing Comparison
- Midjourney offers a $10 monthly plan with limitations.
- DALL-E 3 in ChatGPT is the most expensive at $20 per month.
- Leonardo has a free plan and a $10 per month plan, providing good value.
- Firefly offers a free plan and a paid plan at $5 per month.
- Google's image generator is free to use.
- Ideogram is currently entirely free.
45:45 🏆 Conclusion and Recommendations
- Leonardo is recommended as the best value with a score of 75.5.
- Midjourney and Ideogram offer different strengths and drawbacks.
- DALL-E 3 in ChatGPT is less recommended due to high cost and censorship.
- DALL-E 3 in Bing's Image Creator is an accurate, free alternative.
Yet again, outstanding content from an outstanding author. I really appreciate the amount of work you must have put into this article, and the net result was a go-to chart I know I will want to return to, over and over. In fact, I will probably make my own version of your chart in a spreadsheet, and add the important column I think was missing -- Weight. Each of the criteria can take on a relative importance depending on the type of image the creator normally produces, the look and feel of the image desired, the amount of money available for the project, etc. So I am going to add a column called "Weight" which will multiply the results in the columns by a weight of 1-10 -- and that will make your chart absolutely perfect!
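The "Weight" column idea above can be sketched in a few lines of Python. This is only an illustration: the generator names, scores, and weights below are made-up placeholders, not the numbers from the video's chart, and `weighted_total` and `rank` are hypothetical helper names.

```python
# Hypothetical 1-10 scores for illustration only; NOT the video's actual chart.
scores = {
    "Generator A": {"accuracy": 6, "realism": 9, "price": 10},
    "Generator B": {"accuracy": 9, "realism": 7, "price": 5},
}

# Relative importance of each criterion on a 1-10 scale (also an assumption).
weights = {"accuracy": 10, "realism": 5, "price": 2}


def weighted_total(criteria_scores, weights):
    """Sum of score * weight over every criterion."""
    return sum(score * weights[name] for name, score in criteria_scores.items())


def rank(scores, weights):
    """Return (generator, weighted total) pairs, best first."""
    totals = {name: weighted_total(s, weights) for name, s in scores.items()}
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


for name, total in rank(scores, weights):
    print(f"{name}: {total}")
```

With these placeholder numbers, Generator A has the higher plain sum (25 vs. 21), but Generator B comes out ahead once accuracy is weighted heavily, which is the kind of reordering a weight column is meant to expose.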
Dalle-3 is great unless you like horror images, or any artist from the last 100 years, or anything even marginally risque that involves women or celebrities. It's way too censored. I hope they dial it back a bit, otherwise people will bail really fast when the new Midjourney model comes out that includes better accuracy and text.
I was so confused today after Image Creator wouldn't let me do a zombie.
100% truth man
@wildstarr Dalle-3 cancelled Halloween
Minor secret. Avoid using direct words like he, she, her, him, male or female. Use less direct terminology. I believe it'll get you better Dall-e results. It turns it into an image dice roll.
Firefly wouldn't generate a glove for me yesterday because of "sensitive content" lol
Your review is nothing short of remarkable, complete, eclectic, and incredibly concise. 😊
Absolutely love your content Matt. I love the advances in AI at the moment and know more than most people in my circle. When they ask if there's anywhere to keep in the loop, your channel is my go-to recommendation for ease of information. Please keep crushing it!
Thank God for this timely post, I had begun working on this very project this morning - you have better criteria and saved me a ton of work - may you be blessed for this good deed
Matt this was truly excellent. Please make this a regular thing you do, upon major releases from the leading platforms. As someone who has generated over 20,000 images in MJ, and about a 1000 in DallE3, I can say you nailed it. I really enjoyed seeing how the models I don't regularly use, stacked up. Please do this again with MJ6 in December.
May I ask what you are planning to do with all these generated images? And do you keep everything you create or do you delete the imperfect ones as you go?
Hey. They sound like the figures I've generators.
I found a great 1. Freepik.
Its just really good at loads of stuff.
U get millions of stock fotos/videos
Sketch to generate
Re imagine (just churns out unlimited versions)
Enhance , creative upscaler
A hidden gem.
The effort you put in for giving quality content is simply marvelous. 🔥👌👍🏽
Great job Matt! We appreciate all the work that went into this video. My takeaway was a little different from yours (tiling images throws off sums as it’s a very specific use case, and - for me - accuracy would be much more heavily weighted). But looking at the same information and coming to a different conclusion is the mark of truly being informed!
Hey Matt, thanks for keeping us up to date :-) What you did not test and what's proven to be important for me is the ability to train your own models. Leonardo does that.
If Midjourney can create their own LLM and integrate it with its Image creator, Dalle3 won't stand a chance. Censorship will kill Dalle3, but I don't think OpenAI cares. Dalle3 is free (Bing version), so there's no incentive to change anything about it because the ChatGPT audience and the AI Art audience are different.
Yes, thank you for saying that. I haven't been able to generate usable images with DALL E between its censorship and weird color scheme. The images it makes are too tiny, and they look fugly when enhanced IMHO
The censorship makes it almost unusable for real
Like it's blocked probably half the images I tried to create. It even didn't like "graffiti" like wtf
My two cents: rather than waiting for Dalle to become less censored, expect that MJ will become more censored. They ban new words on a daily basis, and when the web version launches they might set new moderation rules. It's not about the LLM. Dalle understands prompts better because the dataset was properly captioned. There is a white paper explaining it on the OpenAI site. MJ would need to re-train their models from the ground up with the same level of image captioning to get as good with prompts as Dalle is.
Great video! I would have been interested in which ones are the best quality in terms of the size/resolution of images you can create. So many of these output very small/low resolution images that either require upscaling or are unusable for anything other than small web graphics. But, maybe that can be in the next one. Great work Matt! 😊
I've sometimes used Midjourney images in videos without upscaling when I was pressed for time. Widescreen 16:9 images are around a thousand pixels wide, whereas DALLE will give you 275 px square useless crap. But it's still better to find a good upscaler. OH-- Midjourney now has a 200% and a 400% upscaler BUILT IN but you have to spend fast hours to use it.
Great job as usual, Matt. Only thing that skewed it a bit is that you were using SDXL 0.9 in Leonardo instead of SDXL 1.0 from over at Clipdrop - the quality and accuracy jump from 0.9 to 1.0 was pretty major in my experience. Recently when illustrating a book for a client, 0.9 was pretty much unusable, whereas since 1.0 came out, it's become our go-to.
Nice job! It's really interesting to see a direct comparison between the different models. One thing I may mention, though, in the video it looks like your image generation in Leonardo is set to Dreamshaper v7, which is a SD 1.5 model. SDXL should be able to do a slightly better job at text.
Such an amazing video and explanation!!! Your effort and dedication is truly appreciated 🙌
Matt, what a fantastic overview of all these products which can be overwhelming deciding which one to use for what. Thanks for the clarity. Love the channel!!!
I appreciate this video, Matt. I'm already rewatching it for the second time. I use many illustrations of people doing their occupations for my work, and use Midjourney. Typically I am looking for a realistic black and white high-quality drawing. Others I do are historical color photos of Civil War soldiers and that type of thing. I have to take almost every illustration into Photoshop and fix the wonky faces, hands, etc. I appreciate the time Mj saves me over doing this work from scratch, but we still have a ways to go. Like you always say, it can only get better. I know you usually say "this is the worst it will ever be." 🙂 The one additional thing I would have liked to see in your video is prompting with an image. Often I have to use a reference image to get into the ballpark of what I want.
Thanks for all the hard work. Some of these I've never used. Bing Image Creator with Dalle-3 is my go to.
Thank you, had a lot of trouble picking, had sensory overload of these frickin generators and confusion, all until now that i watched this!!!!
As a data nerd, this checked ALL the boxes. Near the end I started adding all of them up, assuming you wouldn't, and then you did and I just felt so complete. Like that final "ahhhh" perfect! Excellent video and well worth the effort. I would absolutely LOVE to see a similar comparison for other categories: text-based, math, programming, etc. PLEASE MATT
Stable diffusion by a long shot.
It takes extra work, sometimes lots of extra work, but it's the only one that's capable of producing exactly what you are imagining consistently.
it definitely is not, i'm not sure how long you've been using stable diffusion but i bet you'll find out that it's actually very limited.
@o1f444 dude. That was like a whole month ago.
I still stand by it if you're using ComfyUI, but that could change later today.
Wow Matt, the work you put into these videos is incredible. Thank you for all the effort and brilliant content. This video particularly was so helpful to me (and others I'm sure). I managed to get a print-out of your final comparison chart. It's so helpful to have it all in one place like that. Perfect. Thanks again
I think the scores lack weighting: tiling, for example, is not used much, and things that are used more should carry more weight. Also, the price of DALL-E 3 is effectively free, because you are already paying for GPT-4 and it comes with that at no additional cost.
Wow... wish you would do that with other tool categories as well! Good job. I admit, in the beginning I didn't think that a comparison would be interesting... but it makes it easier to "digest" what your videos teach (they have usually been a little "hectic" and nervous before, because of the quantity of information in AI). Thanks for your work!
With DALL-E (2024/2) it is important to start a new chat any time you begin a new image concept, otherwise the context window will come to bear on the image produced.
Also, DALL-E uses an LLM to rewrite the raw prompt into the actual prompt that is sent to the model for inference/diffusion. If you call the OpenAI API via a script, the JSON response includes the actual prompt sent for inference. See Hugging Face Spaces for various "prompt generators" which provide similar functionality for other AI systems, including Midjourney and Stable Diffusion.
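As a rough sketch of what the comment above describes: for DALL-E 3, the Images API response carries the rewritten prompt in a `revised_prompt` field on each entry of `data`. The snippet below does not call the API; it parses a hand-written sample payload (the URL and prompt text are invented for illustration), and `revised_prompts` is a hypothetical helper name.

```python
import json

# Hypothetical response body, shaped like an OpenAI Images API reply for DALL-E 3.
# The URL and the prompt text are invented for illustration.
sample_response = json.loads("""
{
  "created": 1700000000,
  "data": [
    {
      "url": "https://example.com/image.png",
      "revised_prompt": "A photorealistic penguin holding a wooden sign that reads Subscribe"
    }
  ]
}
""")


def revised_prompts(response):
    """Collect the rewritten prompt for each image, skipping entries without one."""
    return [
        item["revised_prompt"]
        for item in response.get("data", [])
        if "revised_prompt" in item
    ]


for text in revised_prompts(sample_response):
    print(text)
```

Comparing the returned text with what you originally typed shows how much the intermediate LLM changed your wording.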
Great video. It would be really interesting if you could do this again in maybe 6 months time, to see what things look like then 🙂
Matt, this is the most comprehensive and informative video I've seen on this subject. Thank you for all you do! Keep up the most excellent work! 💜
Given a highly descriptive prompt (that you can have GPT4 generate for you) you can produce some really great results with DALL-E 3 I found, which well surpassed SDXL. It really doesn't belong at the bottom of the pile, especially if you stick to what it is good at.
It would be too much to expect Matt to refine each prompt to get the best results for each system, but I agree that with GPT 4 the process is not about your first prompt but how you get GPT to improve the prompt over subsequent generations.
@vodkaman1970 Yes, and it was super beneficial to see the same prompt for comparison.
Do you guys know the best ai picture generator for explicit content, celebrities etc ?
@rodrigoorozco9263 If you've got a beefy enough machine you can set up stable diffusion on your own private server, and it can be completely uncensored or filtered. I haven't tried but there are a few videos explaining how to do it and it doesn't look too difficult.
Nice comparison, but just a note that a lot of the negatives you gave for SDXL can be fixed by using specific LoRAs, ADetailer, different refined models, etc.
@eg9xyz Stable diffusion is the best one in my opinion, you can control every piece of it, using other images as references. I use Dalle-3 images from Bing and then edit them in Stable Diffusion. Stable Diffusion has a line art mode: I drew a bunch of shapes on a piece of paper and uploaded them, wrote a prompt, and it made an incredible image from the scribbles and shapes I made!
I've been needing this kind of comparison. This is really appreciated, thank you so very, very much, Matt. Great vid as always. 🙌
Super show, as always, Matt! I am glad I invested in Leonardo. Now I need to go get some value out of it!
SOLUTION: Matt, I love your channel. I find it very informative. I got the Text inside images with Google to work. Enclose the phrase inside curly brackets!
Thanks Matt 🙏🏽
Excellent video, as always, thank you. Just a note to color spaces: RGB is the color space used for digital displays, similar to how CMYK is used for printing. Given that all non vector images generated by AI are until now inherently in RGB mode, the term "RGB image" in a prompt may not convey specific color requirements or color choices to an AI image generator and should not be seen as a flaw in the output.
Maybe copyright issues on the Eiffel Tower.
For example, while many countries have a “freedom of panorama” law that allows for the photographing of skylines and copyrighted buildings, the European Union actually allows countries to opt-out of these laws, which results in some of the most famous European landmarks being illegal to photograph.
Photographing or videotaping the Eiffel Tower at night, however, is barred. The copyright restrictions on the Eiffel Tower are different for its nighttime viewing because lights were not installed on the Eiffel Tower until 1985, which means that the Eiffel Tower with its lights on is still under copyright protection today.
Leonardo AI added a new refiner called Alchemy Refiner, which goes back to correct any malformed limbs and hands. And I love its Canvas feature, which allows you to go back and fix spot defects, or add images to images and link them together.
Wow, that's a great comparison. Very interesting! Thank you for making this video.
As a daily Leonardo ai user I have to mention that it can do much more. The model DreamShaper v7 was used in the video but there are many more amazing models to use on Leonardo ai. The most powerful ones right now are: Leonardo Diffusion XL, Leonardo Vision XL, AlbedoBase XL and PhotoReal. They create amazing results.
Very helpful and thank you for the wonderful work you do. (Voice gets a bit squeaky, but I guess that’s part of your brand). I appreciate you.!
Surprised to see Stable Diffusion leading the pack, especially when many seem to lean towards it for its lack of content restrictions. I've always been a fan of it due to its cost-effectiveness with a good GPU. However, with the community's diverse opinions on various AI tools, it's clear that there's more to these tools than meets the eye. The discussions around censorship, usability, and tool capabilities are particularly intriguing. It's a dynamic space, and I'm eager to see how these platforms evolve in response to user feedback and technological advancements. Cheers to the ever-evolving AI-ionosphere! 🚀
Thank you so much for putting the time and effort into this video. This has helped me so much and saved me a lot of time 10/10
Thank you for always making these incredibly informative videos. I always look to you for where to put my time and energy and you’ve made this crazy AI journey feel less overwhelming with each video. 🙏🏼☯️🌞
I never thought someone would go above and beyond like this. Thanks a ton! 👍
This table was great. I personally give a lot more weight to accuracy than anything else, as I don't really care how beautiful and in detail the image is, if it is not what I am asking for... Therefore, I don't agree with the overall scores, but it is great to have all of them organised by their strengths and weakneses.
Awesome job!
Great job! The best thing about DALLE inside GPT is that it's ALREADY included in something I consider well worth paying for (meaning what GPT-4 provides)
In conclusion, Leonardo has the most nines and tens of all of them. You forgot to mention "Blue Willow" and "Imagine Ai"; these are great AI image generators as well.
Hi Matt, this is one of your great videos. I highly admire and appreciate your effort. I would request 2 more parts of this, because different platforms give different results according to their prompt dependency, like keywords or negative prompts. You could also share other comparisons of tools like bot avatars or whatever you like best. Thanks
Thanks for the feedback. I actually wanted to compare a lot more but didn't want to make a 2-hour video. I'll likely do a followup though. :)
Thanks Matt for this effort to make a comparison between AI platforms 🙏🙏🙏
Excellent work! And thank you for all of your time you put into it. For vector I would def suggest to use another model than dreamshaper 7.0. But glad to see it still outscored all others. I use both MidJourney and Leonardo and wanted to see this video to see the difference with everything else available
I mean with SD the same seed and prompt will always be similar, and that is actually one reason it is great for making characters in a similar scene, among other uses. Also, you need to add more to your prompt for realism, like “photo”, in most image generators, or some will randomly generate more illustrative images.
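A toy illustration of why a fixed seed gives repeatable results: diffusion models start from random noise, and a seeded random generator always produces the same noise. The sketch below uses only Python's standard `random` module as a stand-in; `initial_noise` is a hypothetical helper, not part of Stable Diffusion or any real pipeline.

```python
import random


def initial_noise(seed, size=8):
    """Stand-in for the starting latent: `size` Gaussian samples from a seeded RNG."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(size)]


same_a = initial_noise(seed=1234)
same_b = initial_noise(seed=1234)
different = initial_noise(seed=4321)

# Identical seeds reproduce the same starting noise.
print(same_a == same_b)
# A different seed gives a different starting point.
print(same_a == different)
```

In a real generator the prompt, model, sampler, and settings also have to match for the output to repeat; the seed only pins down the starting noise.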
Thanks!
I commented on another video looking for advice that I’m probably more likely to find in this one but… I’m using the GLAM app for iPhone and it’s inspiring me to look into more “pro-level” options.
Great job Matt. Thanks for all your hard work in putting something like this together. Problem is it never stays up to date for very long.
Amazing work. Thx a lot. For next time, you can maybe consider adding a "Speed" category to compare the time needed to generate a lot of images, since for us AI creators, a tool that generates 1 image every 30 sec is pretty useless.
Great idea! Thanks for the feedback. :)
I missed it, but what prompt did you use to create the penguin at the end? I swear I have seen that penguin before, which makes me wonder what happens if there are two images that are alike, how intellectual property laws apply, and how such disputes would be handled. I was under a general and perhaps incorrect assumption that AI created “one of a kind” works, but I know I have seen that same penguin before and believe I created something almost identical. It's a very interesting subject in terms of how far things have come in just a short time. I know models from months ago are not even on the same level as those of today! It's going to be fun to see where things go as models get more advanced!
12:03 for creative you picked the wrong word. Beauty actually has 2 common definitions (one in broad sense like you are thinking) but the 2nd definition in most dictionaries has "beauty" as a noun meaning "a beautiful woman" as in "she was a real beauty." So in this case you are gonna get consistent images of women. Prompt choice is so important, I think a better creative prompt would have been using 3 words that help describe creativity: "(imaginative:0.3), (artistic:0.3), (inspirational:0.3)"
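The `(word:weight)` form in the suggested prompt resembles the emphasis syntax accepted by some Stable Diffusion front ends. As a minimal sketch of how such a prompt breaks down into tokens and weights, the snippet below uses a simplified regular expression; it is an illustration, not the actual grammar of any particular tool, and `parse_weighted_prompt` is a hypothetical helper.

```python
import re

# Matches "(token:weight)", e.g. "(imaginative:0.3)". Simplified on purpose:
# no nesting and no escaped parentheses.
WEIGHTED_TOKEN = re.compile(r"\(\s*([^():]+?)\s*:\s*([0-9]*\.?[0-9]+)\s*\)")


def parse_weighted_prompt(prompt):
    """Return (token, weight) pairs found in the prompt, in order."""
    return [(token, float(weight)) for token, weight in WEIGHTED_TOKEN.findall(prompt)]


prompt = "(imaginative:0.3), (artistic:0.3), (inspirational:0.3)"
for token, weight in parse_weighted_prompt(prompt):
    print(token, weight)
```

Whether a given generator honors these weights at all depends on the tool; several of the hosted services in the video take plain text only.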
Thank you for all your hard work. As a MidJourney subscriber, I've been wondering if I should stick with it or look at something else and this has been really helpful. Accuracy in MidJourney has certainly been a bugbear with it sometimes taking six or seven prompts to get what you want, and you addressed that perfectly.
Sheesh... this one must have taken bloomin ages. Dedication.!! 🙌
48 minutes!?! It was bound to happen. GREAT breakdowns! 😎
Great content Matt 🔥🔥🔥 🚀
Thanks so much for taking the time to make this video ! Very informative 🙏
the one thing you forgot to mention is that stable diffusion is completely free if you run it locally on your computer
Be sure to do the same review next year and compare those results with these.
Try this prompt: Create an image of a penguin holding a wooden sign that says "Subscribe" "to" "Matt" "Wolfe"
This should do the job.
Great but a lot of these are just style variations.
A lot of use cases use an existing image as a start point so that would be good to cover, with the input image either providing the style or the size and shape. ControlNet support would be essential to getting a good score for this.
Another use case involving an input image is inpainting/outpainting, along with a prompt to extend or replace part of it.
A lot of generators can't even accept an input image so would get 0. How well the generator holds onto the essential qualities of input image that you're trying to capture while generating the aspects that go with it would define the score.
Another that tests the abilities in mixed mode would be good too, as in it has understanding of real world data.
A good test for this is to ask for the path of a river as a line on a map. For example "Draw the path of the River Thames as a thin red line on a map of Britain".
Every image generator I've tried fails at this. Even prefixing something like "using accurate ordnance survey geographic data" still results in random drawings of rivers. Even when the outline of the Thames through London is well known and in plenty of training data images asking for that returns stylized imagery associated with the Thames instead of the actual shape.
Once Musavir is out of Beta you will be amazed!! I was asked to beta test it and it is mind blowing 4K output and humans so real. Btw you should use -s 500 with style raw in Midjourney
Could you make an update video on this, and compare them again? Especially because of all the updates and pricing.
Great content. I probably would score differently, but you did say it was subjective.
One category missing, which to me is huge, is licensing. I love Bing's output but cannot use it commercially. Probably most people watching this are using AI for POD, etc.
Thanks for the great content.
The only one that I supremely disagreed with on the ratings was the price for Leonardo. Other ones that were only free were given a 10 and Leonardo does have a free version but also a paid version and it was given 7.5. Riddle me that one.
Thanks so much for the warning about licensing. I was going to mess around with Bing but not if that means I can never use the image for anything commercial.
This was a great breakdown, I literally just starting learning about image AI and I really needed to see some comparisons so I knew which one to try. Thanks!
Love it - pretty detailed comparison- thanks for doing it Matt. I’m a fan of your weekly contents
I’m curious how Blue Willow stacks up. I found it did a better job than Midjourney when it came to feeding it an image and asking it to make something based on that image!
Blue Willow is Stable Diffusion front end like Leonardo so the initial base model is the same.
@digidope aha got it! Thank you
Weird pricing grading. Dalle that slows down in fully free mode is 10, but since it gets slower it should get 8 or 9. Stable diffusion that is absolutely free in auto1111 or some other platforms gets 7.5 because of the platform he chose. Weiiiiiiird. Same about Logos. Leonardo did a great work on it though received lowest grade, though you can also download logo specific model to win the prize.
Thanks for video. I think it was a good work from your side. Like the bing chat image creator myself at the moment. I would also like to see what offline AI or offline creators at the moment is the best. Thinking more on the privacy and if information when using an AI is not supposed to go online.
So I tried to create a Ford GT40 without wheels, but with a smooth, featureless hull in the style of a flying car. I tried doing it using Midjourney and failed, but then tried using ChatGPT and Dalle.
While it was super painful, I realised that ChatGPT was blind to the results and was just doing its best to communicate my objectives to Dalle without being able to see the output. Once I realised this, I began to work with ChatGPT as a collaborator, both of us (I know!) trying to troubleshoot communicating with Dalle, which involved me having to describe the output to ChatGPT so we could figure out the next steps. While WE mostly failed, it was a crazy paradigm-shifting experience for me, where I was collaborating with an AI (with whom I had a good rapport) to try and communicate with another AI that was harder to communicate with. I'm sure I'm projecting like mad, but I felt like ChatGPT was getting as frustrated as me! Happy to share the chat with you, @mattwolfe, and to have you share it with anyone.
Super Video!
I am using a lot of AI generators because of all these different possibilities. But most of the time I spend on local Stable Diffusion, not only because I love the Open Source movement very much: I like the idea that creative people all over the world are working on improving SD...
BTW: With the new SDXL tiling you can also choose between horizontal or vertical tiling.
Happy colored Greetinx
I find Dall-E 3 via Bing to be consistently better at realistic images than its Chat-GPT counterpart. I hope that changes soon.
ChatGPT is very good. I added some custom instructions and the images are just next level
@universe6735 I'm speaking specifically of photorealism, especially for faces. In my observation Bing (at the moment) is far better in that arena.
If you look at the actual prompt that GPT uses, it changes words, puts in synonyms and uses flowery language like "captivating the feeling of loneliness" which I think at best is wasted on DALL-E 3 as it knows shapes, color and textures, not emotions. You can tell GPT how it should prompt or even insist that it takes your prompt exactly word for word. I highly recommend Glibatree's YouTube video on custom instructions for DALL-E 3 and you'll get much better images in no time
@universe6735 Yes, I found DALL-E 3 quite underwhelming to start with, but when you understand how it creates the prompts from what you're asking and then you guide it to make better prompts, the results are on an entirely different level
@vodkaman1970 I feel like I've tried everything. I've even asked GPT to behave like Bing.😁😁😁 Of course, I'm aware of the descriptive language that GPT adds to original prompts. However, when I copy those same augmented prompts from GPT and use them in Bing, the faces in Bing are still noticeably more realistic. When I use the approach of instructing GPT to use my prompts exactly as I have written them, GPT continues to yield less impressive results than Bing for photorealism of human skin.
Great video Matt, very lengthy indeed, but well informative
That was a great video, I found the information to be exactly what I have been looking for, I would like to see how capable they are on different operating systems, iOS, Mac, windows, Linux overall a great video thanks for all the hard work
I found the 3 headed monster test to be GENIUS. It really helped to show the differences. Subscribed!
My free best setup is: Leonardo for almost anything, Dall-E for text and specific prompts, and Dall-E inside bing Chat is better than Image Creator! To generate realistic images, I miss paid versions because they can do better results! Even leonardo starts at some point restricting some good features. All my recent thumbs have something of Dall-E or Leonardo.
Leonardo just came out with Alchemy v2 and the new Refiner a few days ago.....these are amazing and vastly improves the image generations!!
You can actually get excellent illustrations with style raw from Midjourney if you prompt for cartoon style, anime, anime cartoon, etc.
Hey Matt, one mistake you made: you gave Dall E a 3 and a 6 for something you gave Google a 7.5. Both censored Tom Hanks but created images of the Storm trooper, Mario and SpongeBob. You should've rated them equally.
I think he also rated MJ the same as Google, when MJ did both characters. Oh well, I guess it was a marathon for him so a few logical errors crept in.
I get the absolute best output with Night Cafe. I have tried 8 different ones (including the ones you reviewed)
Leonardo!!!
at 28:55 you had to click on generate. 32:33 click on an image, click on edit and then write the prompt
Well done Matt! Great quality video!
I appreciate the effort you went through to give an overview on this. I disagree with some things (There are other SDXL alternatives to Leonardo that are free and work great), but the overview of what the different services are capable is pretty useful. Kudos!
Matt -- this has been soo helpful. Ping-ponging back & forth btw the various image generators can be frustrating, but having a road map like the one you just created can hopefully take a bit of the struggle out of it.
Really would like this kind of review of the current LLMs.
Something I like about MidJourney is it keeps your images on a server. How about the others?
Discord is discontinuing this feature soon. They'll be expiring images after a while. All of the others store your images in the cloud, at least for some time.
We found Midjourney's last subscriber!
Great video. I needed this. Just like with life, you can see the bias in AI through the beauty prompt. Overall this was so helpful.
I look forward to having people review NovelAI when they roll out their image generator update. They are making an image generator from the ground up. Right now it's a fork of SD that's specialized for anime and takes prompts differently because it's trained off of Danbooru. It has a free tier, credits you can buy, and multiple paid tiers, and it's fully uncensored. If the bot understands what you want it to generate, it will do it.
Great video! I would vary my scores from yours ever so slightly, but overall I thought this was an awesome rubric review of all these tools. I enjoyed watching it, thank you.
Fantastic comparison video! I laughed a lot seeing the images and learned a lot about the tools and the AI image generators in general. I'll give a 10 out of 10 :D
Best YouTube channel right now! Great stuff!
The pass/fail metric for backgrounds/textures throws off the results. Weighting it as either 0 or 10 is a different methodological technique than the other types of metrics. Also, have you seen any of Dave Portnoy's One Bite Pizza Reviews? They're loaded with tips on how to squeeze the most interest out of these ratings-type videos, which in Dave's case are wildly popular and entertaining. His first bit of advice to you would likely be to resist "rookie scores," meaning round numbers, e.g., a '7.0' instead of a '7.3' - never use rookie scoring 🙄
What is the downside to essentially using a one out of 100 scale instead of one out of 10? That’s interesting to me because I prefer how much more granular you can get when you literally have 10 times the number of scoring options.