Go beyond Text with Google APIs

  • Published on 29 Sep 2024
  • Join Raphaël Semeteys, DevRel at Worldline, in the fourth episode of the tutorial series "GenAI's Lamp," focusing on Generative Artificial Intelligence. This episode is dedicated to using Google APIs to work with text and images in a multimodal way.
    🔍 What's Inside:
    - Demonstrations of the Imagen model's image generation and the imagetext model's image captioning capabilities.
    - Use of visual question answering to interact with images and receive relevant answers.
    - Introduction to the Gemini Pro and Gemini Pro Vision models from Google DeepMind for text, image, and potentially audio and video reasoning.
    - Showcasing Gemini Pro for combining text and code input, and Gemini Pro Vision for instructions derived from images.
    - Exploration of multimodal embedding for advanced applications like image classification and search.
    - Conclusion highlighting the evolution of AI models towards multimodal functionalities and future prospects.
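
The multimodal embedding point above can be illustrated with a minimal sketch. It assumes text and images have already been mapped into the same vector space by some embedding model; the three-dimensional vectors, the file names, and the helper names `cosine_similarity` and `search` are hypothetical and are not taken from the video or from any Google API. Search then reduces to ranking stored image vectors by cosine similarity to the query vector.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def search(query_vec, index, top_k=1):
    """Return the names of the top_k indexed items closest to query_vec."""
    scored = [(cosine_similarity(query_vec, vec), name) for name, vec in index.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [name for _, name in scored[:top_k]]


# Hypothetical image embeddings (real ones have hundreds of dimensions).
IMAGE_INDEX = {
    "cat.png": [0.9, 0.1, 0.0],
    "dog.png": [0.1, 0.9, 0.0],
    "car.png": [0.0, 0.1, 0.9],
}

# Hypothetical embedding of a text query such as "a photo of a cat".
TEXT_QUERY = [0.8, 0.2, 0.1]

if __name__ == "__main__":
    print(search(TEXT_QUERY, IMAGE_INDEX, top_k=2))
```

Classification can follow the same pattern under the same assumption: embed one text label per class and assign each image to the label whose vector scores highest.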
    🔗 Associated content
    github.com/wor...
    📚 Resources of the video
    jupyter.org
    cloud.google.c...
    🔗 Follow Raphaël
    dev.to/raphiki/
    github.com/rap...
    / raphaelsemeteys
    / raphaelsemeteys
    www.semeteys.org
    🔗 Follow us
    blog.worldline...
    / worldlinetech
