Glad you're keeping track of this, but image analysis doesn't use DALL-E in the background. Omni is a fully multimodal model, and it analyzes image embeddings directly with the embedding CNN trained in parallel with the main model. It currently uses DALL-E to make images, but not to analyze them. In fact, GPT-4o can make images directly, but as expected, OpenAI isn't planning on releasing that capability any time soon.
Very easy to follow. Good stuff!
Great stuff. Thank you
Our pleasure!
Thanks for the clarification.