Fine-tune Florence-2 for an Object Detection Task

  • Published on 6 Sep 2024
  • Learn to fine-tune the Florence-2 model for an object detection task.
    GitHub: github.com/Aar...
    Dataset: universe.robof...
    This is a step-by-step tutorial.
    1- Dataset preparation for the Florence-2 model. We will prepare the annotations in the format that the Florence-2 model accepts.
    2- Fine-tune the model to perform custom object detection.
    3- Inference on unseen data.
    #computervision #objectdetection #finetuning #ai #artificialintelligence
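
As a rough illustration of step 1, here is a minimal sketch of how pixel bounding boxes might be turned into the text targets used for Florence-2 object detection. It assumes the commonly used convention of an `<OD>` task prompt and `<loc_N>` tokens, with coordinates quantized into 1000 bins per axis. The helper names (`to_loc`, `encode_boxes`, `decode_boxes`), the `prefix`/`suffix` record keys, and the file name `example.jpg` are hypothetical and are not taken from the video's code.

```python
import re

BINS = 1000  # assumption: 1000 location bins per axis


def to_loc(value, size, bins=BINS):
    """Quantize a pixel coordinate into an integer bin in [0, bins - 1]."""
    return min(bins - 1, max(0, int(value * bins / size)))


def encode_boxes(boxes, width, height):
    """Turn (label, x1, y1, x2, y2) pixel boxes into one target string."""
    parts = []
    for label, x1, y1, x2, y2 in boxes:
        locs = (
            to_loc(x1, width),
            to_loc(y1, height),
            to_loc(x2, width),
            to_loc(y2, height),
        )
        parts.append(label + "".join(f"<loc_{v}>" for v in locs))
    return "".join(parts)


# One label followed by exactly four location tokens.
_BOX = re.compile(r"([^<]+)<loc_(\d+)><loc_(\d+)><loc_(\d+)><loc_(\d+)>")


def decode_boxes(text, width, height, bins=BINS):
    """Approximate inverse: map each bin back to its left/top pixel edge."""
    boxes = []
    for label, *locs in _BOX.findall(text):
        x1, y1, x2, y2 = (int(v) for v in locs)
        boxes.append((
            label.strip(),
            x1 * width / bins,
            y1 * height / bins,
            x2 * width / bins,
            y2 * height / bins,
        ))
    return boxes


# Hypothetical training record for a 640x480 image with one "car" box.
record = {
    "image": "example.jpg",  # hypothetical file name
    "prefix": "<OD>",        # assumed object detection task prompt
    "suffix": encode_boxes([("car", 64, 48, 320, 240)], 640, 480),
}
print(record["suffix"])  # car<loc_100><loc_100><loc_500><loc_500>
```

Quantizing relative to the image size keeps the targets independent of resolution, and `decode_boxes` can be used at inference time (step 3) to map predicted tokens back to approximate pixel coordinates.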

Comments • 22

  • @habbathejut
    months ago

    great work, thank you for the video!

  • @Mamunur-illini
    2 months ago

    Happy to see you back on YouTube.
    Could you please make a comparison video of all the models for object detection? Thank you.

    • @CodeWithAarohi
      2 months ago

      Thank you! Sure, I will do a video on comparison.

    • @fouziaanjums6475
      2 months ago

      @CodeWithAarohi Hi ma'am, I request you to please make a comparison video on image classification using various transformer models too.

  • @TapanSingh-z3j
    16 days ago

    Hi Aarohi, very good explanation. I want to ask one question: after fine-tuning, does the model lose its ability to perform its previous tasks? With the pretrained model we were able to see 'tag', but after fine-tuning it does not show that. Kindly answer, because I want to use this model. Thanks in advance.

  • @nakulmali1413
    2 months ago

    Thanks for the explanation of this topic, ma'am. Please upload a video on how to combine a YOLOv5 object detection model with a classification model. Thanks in advance.

  • @abdulmeral4811
    26 days ago

    Hi Aarohi, thank you for the great example!
    Did you have the opportunity to compare performance between YOLOv10 and Florence-2?

  • @litziadrianacruz7583
    months ago

    How is image resolution handled as a hyperparameter in this model?

  • @TusharKamle-n5w
    months ago

    Hey Aarohi! I loved your work on Florence-2 finetuning. Do you know how we can train this Florence-2 model for OCR purposes only? What should our dataset for training look like, and how much change do we need to make in the inputs?

    • @CodeWithAarohi
      months ago

      I haven't tried this part yet.

  • @abdelrahimkoura1461
    2 months ago

    Firstly, thank you for your beautiful explanation. Secondly, could you post a link to the custom dataset, or allow it to be downloaded from Google Drive, so that we can run the code? Thanks again.

    • @CodeWithAarohi
      2 months ago

      Dataset: universe.roboflow.com/universiti-malaysia-pahang-qcvas/objectdetection-ngxjp/dataset/5

  • @KhloodRashad
    months ago

    Can I build a housework alert system using artificial intelligence?

  • @Satchi017
    months ago

    Thank you for the explanation. However, how can this be considered automated image annotation for object detection if we used 2379 images for training and 123 images for the validation dataset, which were all manually annotated?

  • @rishabhsheoran6959
    months ago

    Hey Aarohi! Love your explanation. Can you pls make a video on custom Action Recognition (Human actions, Human-Human, Human-Object)? Are these possible using a single model?

    • @CodeWithAarohi
      months ago

      Sure, after I finish the work already in my pipeline.

  • @mohammadyahya78
    2 months ago

    Do you think it's better than YOLOv8? What use cases might lead us to choose this one, given that the model should work in real time?

    • @CodeWithAarohi
      2 months ago +2

      Florence-2 is a lightweight vision-language model and you can fine-tune it across tasks like captioning, object detection, grounding, and segmentation. Being vision-language oriented, it might excel in tasks where understanding textual context with visual data is crucial.
      YOLOv8 is a popular model specifically designed for object detection, segmentation, and classification tasks. It is known for its speed and accuracy in real-time object detection. If your primary focus is real-time object detection, then YOLOv8 would be a strong choice.
      There is no universally "best" model between Florence-2 and YOLOv8. The decision should be based on your specific use case, performance requirements, and deployment constraints.

  • @velugucharan8096
    2 months ago

    Madam, how can we perform person re-identification when CCTV cameras are arranged in a shopping hall? I want to identify a particular person who is doing unwanted things in the shopping hall. Can you please make a video on this?

  • @viveksaini1497
    2 months ago

    Ma'am, I need the deep learning notes from your videos (the deep learning playlist).