How can LLMs improve Vision AI? OCR, Image & Video Analysis

  • Published on 15 Jun 2024
  • Combine vision and language in an AI model with the latest vision AI model in Azure Cognitive Services. Use natural language to fetch visual content in images and videos without needing metadata or location, generate automatic and detailed descriptions of images using the model’s knowledge of the world, and use a verbal description to search video content.
    Cognitive Service for Vision AI combines large language models (LLMs) with computer vision and is part of the Azure Cognitive Services suite of pre-trained AI capabilities. It can carry out a variety of vision-language tasks, including automatic image classification, object detection, and image segmentation. Similar to GPT, Project Florence, the foundation model used in this case, infuses deeper language skill with vision analytics to make training, inferencing, and interacting with your image and video content simpler using natural language.
    Azure expert Matt McSpirit shares how to customize the model and use these capabilities in your own apps.
    ► QUICK LINKS:
    00:00 - Introduction
    00:48 - Project Florence
    01:52 - Open-world recognition
    03:19 - Dense captioning
    04:23 - Run frame analysis
    05:02 - Train a custom model
    06:29 - Build custom apps
    07:41 - Wrap up
    ► Link References:
    Check out aka.ms/CognitiveVision
    ► Unfamiliar with Microsoft Mechanics?
    Microsoft Mechanics is Microsoft's official video series for IT. You can watch and share valuable content and demos of current and upcoming tech from the people who build it at Microsoft.
    • Subscribe to our YouTube: / microsoftmechanicsseries
    • Talk with other IT Pros, join us on the Microsoft Tech Community: techcommunity.microsoft.com/t...
    • Watch or listen from anywhere, subscribe to our podcast: microsoftmechanics.libsyn.com...
    ► Keep getting this insider knowledge, join us on social:
    • Follow us on Twitter: / msftmechanics
    • Share knowledge on LinkedIn: / microsoft-mechanics
    • Enjoy us on Instagram: / msftmechanics
    • Loosen up with us on TikTok: / msftmechanics
    #LLM #CognitiveServices #OpenAI #Azure #chatgpt
  • Science & Technology

Comments • 8

  • @peterbailey2325 1 year ago +7

    I asked Bing Chat what comment I could make on this video. Here are its answers:
    I can help you with that. Here are some possible comments you can make to the video:
    This video is amazing! I learned so much about how to combine vision AI with language AI for OCR, image and video analysis. Thanks for sharing this Microsoft Mechanics!👏
    I’m impressed by the capabilities of Azure Cognitive Service for Vision. It can do so much with just natural language queries and no metadata or location needed. The Project Florence demo was mind-blowing!🤯
    Wow, this is the future of computer vision! I love how you can use OCR to extract text from images and videos, and how you can use facial recognition to verify identities. The custom models feature is also very cool and easy to use. Great work Microsoft Mechanics!👍
    This is one of the best videos I’ve seen on vision-language AI. It shows how powerful and versatile Azure Cognitive Service for Vision is, and how it can handle different tasks like open-world recognition, dense captioning, and frame analysis. The spatial analysis feature is also very useful for understanding people’s movements in physical spaces. Awesome video Microsoft Mechanics!🙌
    I’m blown away by this video! It shows how Azure Cognitive Service for Vision can merge vision AI with language AI for OCR, image and video analysis. It can understand natural language queries and generate detailed descriptions of images and videos. It can also train custom models with just a few images and no machine learning experience required. This is truly state-of-the-art computer vision! Kudos to Microsoft Mechanics!👏👏👏

  • @extrememike 1 year ago

    This is simply amazing! I wonder whether all of these are value-added features for existing Azure customers, or what the cost structure is.

  • @ifty7017 2 months ago

    That was really helpful. Thanks.

  • @joseville 5 months ago

    7:35 Note to self: this code uses C# `using` declarations. "When declared in a using declaration, a local variable is disposed at the end of the scope in which it's declared."

  • @PubgSpeed-wl8yo 5 months ago

    Please do more examples, especially business-oriented ones, and many more videos. There are very few live examples, and that slows things down. Make a video on how to integrate it with messengers, websites, and web applications.

  • @iPhonePhan 11 months ago

    The rate of innovation is beyond breakneck speed! Literally having a hard time keeping up with MS 😅 🏃🏽‍♂️🏃🏽🏃🏽‍♀️💨

  • @peterbasta415 5 months ago

    Is there a tool I can use to extract only, for example, the "ingredients" section from a product label?

  • @baalamdovberlavanterahsons6369 1 year ago

    Microcloud