This is the calmest advertisement for something so game changing I've ever seen
So far I've given it 4 requests and it has come back with "Sorry, I wasn't able to generate the images you requested". Nothing difficult - for example, put a face on this banana.
I'm so surprised this hasn't gotten a lot of attention yet... but I mean, it has only been 20 minutes
I'm already going insane
Did it work for you?
@@austriasdaughterssons3617 it's not editing images for me, it fails and hallucinates
20 minutes these days is closer to 20 hours pre-AI time. perfectly understandable
just tried, didn't work
The most expensive part of training CNN models is labeling. This is a game changer for generating ground truth data for robotics and self-driving.
How does this change labelling issues? Am I missing something?
@dhruvbnaik Normally you need to hire many workers to create a labeled dataset, which is kinda expensive. With Gemini you can just create a generated dataset and have humans verify it afterwards.
Keep in mind you can prompt with more than just text. You can prompt with a bounding box and ask it to generate an image with a cat inside the bounding box. Repeat a million times and now you have a million photos of cats with known bounding boxes.
@@zyang056 isn’t that just based off previous labelling?
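The pipeline this thread describes can be sketched in a few lines. This is only an illustration - `generate_image` is a hypothetical placeholder for whatever image-generation API you would actually call - but it shows why the labels come for free: the class and bounding box are chosen *before* the image exists, so no human annotation is needed.

```python
import random

def make_sample(width=640, height=480, label="cat", rng=random):
    # Pick the bounding box first, so the ground-truth label
    # is known by construction rather than annotated after the fact.
    w = rng.randint(80, width // 2)
    h = rng.randint(80, height // 2)
    x = rng.randint(0, width - w)
    y = rng.randint(0, height - h)
    box = (x, y, x + w, y + h)
    prompt = (
        f"Generate a {width}x{height} photo with a {label} "
        f"inside the bounding box {box} (pixels, x1, y1, x2, y2)."
    )
    # image = generate_image(prompt)  # hypothetical image-generation call
    return prompt, box

prompt, box = make_sample()
```

Whether the generator actually respects the requested box (and how often) is the part you would still have to verify with humans, as the parent comment notes.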
So, I'm guessing this image generation feature isn't available to regular users at the moment? I tried using Gemini 2.0 Flash Experimental in AI Studio to generate some images, but it kept just describing the image instead of creating it.
Same here !!
Same, I haven't been able to find it in AI Studio. I can't wait!
0:35 It seems that it is not open to the general public yet.
A few hours after the video came out, it claimed to be creating images, but the imgur links it gave me were blank. I figured I'd give it a couple of days to iron out whatever issue that was, but now it says it doesn't do images at all. I'm disappointed. I was excited. It now says that it may have hallucinated an ability to generate images. We're living in interesting times for computers.
@@VtuberSpace early testers within AI studio is what we thought it meant... I guess not.
The combination/blending feature is incredible. I can imagine many use cases for creatives.
Wow, omnigen finally has some competition !!!! So excited
omnigen is bad quality ime
This is very impressive, especially considering the simplicity of the input prompts
If it worked it would be.
Okay that’s completely next level
I see the incredible capability of the model! The model follows the instructions exactly, with very good accuracy, latency, and image quality. Hats off to the research team. 👍👍
I tried to do pretty much the same thing with a picture from my laptop, and it doesn't generate anything, it gives me text, especially on Google AI Studio and the Gemini web app.
Same for me (Germany). Only generates text
@@dominik4496 Yup, same - it gives me code, not an image output
@@simtangaranvijay273 it sometimes gives you the image, but the image is blocked by Google content moderation : (
Because this feature is not available yet, and will be released next year. Currently it's only available to early testers.
US and UK only for now, I believe.
It is not currently working with the Gemini 2.0 Flash Exp model on AI Studio.
ur right - I tried it as well, it just returns a bunch of metadata instead of an image
same
Read video description:
"These new output modalities are available to early testers, with wider rollout expected next year."
@@somthingz3928 We got access to Gemini 2.0 as part of the plan, but not these features
How do we become early testers?
I found no way to do this in AI studio. Gemini says it cannot generate images. In the Gemini app it uses Imagen.
WOW! 4o was announced months ago with still no release, and Gemini just flashed through! 🎉
I’m blown away
what? 4o? maybe o1? and it is available now (not preview)
@@darcos-i6s They introduced multimodal features (like single-model image generation) back in May. But they still haven't released it.
Search for "hello gpt-4o"
@@darcos-i6s 4o was announced months ago and we still dont have image or video capabilities
Not really, this isn’t available till next year
@@maddog2622 it is available on AI Studio
We can't do it yet in AI Studio??
Apparently not until next year
@@dominik4496 yeah.. not available
@@hetthummar9582 I have the model in my studio, but it never succeeds in generating images for me - it's either confused, thinking it doesn't have the capability, or it tries and I get content warnings on the most innocent of requests (sunset over ocean, futuristic font, whatever).
When I ask it to do this, it says "sure!" and then hallucinates an imgur URL. The model I'm using is "Gemini 2.0 Flash Experimental". I guess I am not one of the early testers who gets the new output modalities. Is there some way to see this from AI Studio, or do we just have to try and see if it works?
Same here. Most likely we're not the early testers they speak of.
That’s the real gpt4o right there
Thanks for sharing
So the general public can't use these features yet? The image editing looks absolutely amazing!
And with this, image generation has been perfected after 3.5 years. Now it is possible to create any image with character consistency. I wonder how long it will take to perfect video with sound, to get to movie creation
Amazing and frightening speed - we're already at the "are you tired of complex prompts?" phase...
Will Pixel 9 users get the build early?
Do we have APIs for it?
From Google's blog: "These new output modalities are available to early testers, with wider rollout expected next year."
So my question to Google is, why do they use phrases like "It can NOW natively generate images...blah blah."
Yay, I now wanna try prompting it to draw the chain of thought instead of writing it.
I concur.
i want try it draw the screenshot with the mouse click instead of say the positions.
Is the image editing live yet? I can't find this feature.
I don't think so, I was also searching for the same.
This is literally blowing my mind.
Hey guys, one of the early testers here! The model is fantastic to play with, but when I generate an image it doesn't show it visually, it just gives me some bunch of code. What should I do about that?
then you're not an early tester lol
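One guess about the "bunch of code" several people here mention: it may be the raw image bytes (base64-encoded inline data) that the chat UI fails to render. Purely as an illustration - the payload and function here are made up, not the real SDK's types - decoding such a blob back into an image file would look like this:

```python
import base64

def save_inline_image(b64_payload: str, path: str) -> bytes:
    """Decode a base64 inline-data payload and write it to disk as an image file."""
    raw = base64.b64decode(b64_payload)
    with open(path, "wb") as f:
        f.write(raw)
    return raw

# Tiny fake payload standing in for real image data (PNG magic bytes only).
payload = base64.b64encode(b"\x89PNG\r\n\x1a\n").decode()
raw = save_inline_image(payload, "out.png")
```

If what you're seeing is HTML or JSON metadata rather than base64 data, this won't help - that would mean no image was generated at all.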
How... how can this only have 30k views after a day?
That's some impressive stuff.
I’d like to see it do that, with the car at 45 degrees.
i'd love to see it switch the POV of the car to inside the driver's seat whilst flying through the clouds. The creativity you can have if this is possible is immeasurable.
That... is actually really impressive. Can you do that through the API?
Not working as of now, it's stuck on infinite buffering
I am trying and I don't get what's wrong
This is definitely rf inversion in practice
Finally, this might be the one I've wanted for a long time.
Google IS coming back! 👍
I tried it and it doesn't work. The answer is HTML code... Did anyone else try it?
Same here.
This is really amazing!
this is gonna hurt photoshop a bit
This still cannot be done in Google AI Studio; they should show things that can be done now.
THEY'RE NOT RELEASING THIS UNTIL NEXT YEAR WTFF??
Calm down! It's just two weeks until the new year!
I'd like you to share more information about using the Gemini 2.0 API
I'm starstruck!
No, this does not work. I wanted this feature so badly; I keep trying and it's not working
I'm scared for the future of our national security. This is the voice of our nation's men now. They are not strong enough to defend our country. What would you need to do to an adult man to make him sound like this? Not good.
Fake, the image editor doesn't work. It hallucinates uploading the resulting image to imgur and shows code, not the edited image
Thank you.
More good demonstrations Google will never ship.
I would normally agree, but they just shipped their realtime voice API, and it's pretty impressive. I have a little bit more confidence now that what they are saying they can do here is actually possible and will be made available, but let's see
I'm in Google AI Studio and using Gemini 2.0 - how can I access the voice API? Thanks in advance
That wasn't what I was expecting when you said cat on the skate board
I tried this with literally the same exact car and prompt and it couldn't do it.
The multimodal live AI thing is working crazy good though. Google is cooking
I don't think it will work on humans in the image :/ That's kinda not great, since a lot of images have humans in them, even if not intentionally
whoa ... is this sorcery?
This is next level!
Very exciting
This is almost scary. How will humanity handle this when the lines between reality and fiction are invisible?
meet people in real life and get off the internet
It doesn’t do anything with images and says it can’t manipulate images. It’s amazing to me how Gemini literally never works
better than Gemini 1.5
this is amazing
Give me agents and AIOS.
actually cool
exciting times
Goodbye Photoshop🖐
Google seems pretty back.
For this to happen, it would have to be published promptly - I suspect OpenAI is not sleeping. We need Gemini for Android Auto and Pixel Watch 👍🏼
🍀🍀🍀🍀🍀🍀🍀🍀🍀🍀🍀🍀
and people said Google is lagging behind
Just postpone your subscription to any AI tools. Just wait for this Gemini
Holy moly
Awesome
good job, now make it free
daim open ai be gettin smoked rn
Oh woooow, this is realllllllly willing to hallucinate answers. Just gave it a quick try and asked it about a bunch of stuff that either doesn't exist or it doesn't have knowledge of, and it will just go oooon and oooon inventing stuff. Would not use for most things.
Vaporware