Boost YOLO Inference Speed and reduce Memory Footprint using ONNX-Runtime | Part-1

  • Published Jan 1, 2025

Comments • 6

  • @gomgom330 · 7 days ago

    Bro, why is it that when I run an ONNX model with uint8 quantization (dynamic quantization via onnxruntime), it's slower than the default (float32) ONNX model? Btw, I exported from the Ultralytics .pt and run inference with Ultralytics too, not onnxruntime.
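
    For context, the quantization step was roughly this (a minimal sketch; model paths are placeholders):

    ```python
    # Dynamic uint8 quantization with onnxruntime: only the weights are
    # stored as uint8, activations stay float and are quantized on the fly.
    from onnxruntime.quantization import quantize_dynamic, QuantType

    quantize_dynamic(
        model_input="yolov8n.onnx",         # float32 model exported from the .pt (placeholder path)
        model_output="yolov8n_quant.onnx",  # uint8-weight model written here (placeholder path)
        weight_type=QuantType.QUInt8,
    )
    ```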

    • @swaymaw · 7 days ago

      I'll have to test it out myself. Which model are you facing this issue with?

    • @swaymaw · 7 days ago

      From what I got from your query, I think you're loading the quantized ONNX file through the Ultralytics YOLO API. I don't think YOLO has the optimizations needed to speed up inference with UINT8 weights, so it may be treating them the same as FLOAT32. Try running the same inference through the ONNX-RUNTIME API and see if you get any improvement. It might also be slower because the quantized ONNX file isn't well optimized for the YOLO pipeline to read.
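
      Something like this minimal sketch would show whether it's the wrapper (the paths and the 640x640 input shape are assumptions; adjust to your export):

      ```python
      # Time the float32 and uint8 models directly with onnxruntime,
      # bypassing the Ultralytics wrapper entirely.
      import time
      import numpy as np
      import onnxruntime as ort

      def benchmark(path, runs=50):
          sess = ort.InferenceSession(path, providers=["CPUExecutionProvider"])
          name = sess.get_inputs()[0].name
          x = np.random.rand(1, 3, 640, 640).astype(np.float32)
          sess.run(None, {name: x})                 # warm-up run
          start = time.perf_counter()
          for _ in range(runs):
              sess.run(None, {name: x})
          return (time.perf_counter() - start) / runs  # mean seconds per run

      print("float32:", benchmark("yolov8n.onnx"))        # placeholder paths
      print("uint8:  ", benchmark("yolov8n_quant.onnx"))
      ```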

    • @gomgom330 · 7 days ago

      @swaymaw So I'm better off using the onnxruntime API and building a pipeline from scratch instead of the YOLO API? I have no idea how to make the pipeline. I rewatched this video to the end but still don't know how to create one with onnxruntime for my task (object counting, btw).

    • @swaymaw · 7 days ago

      @gomgom330 I will be creating a repo as this series moves ahead, covering the various tasks available in Ultralytics; object counting is just a use case of object detection, and this was only the first part. Once the detection model pipeline in ONNX is done, you should be able to use the repo to run object detection with ONNX the same way you do with Ultralytics. That's the end goal.
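
      Until the repo is up, the skeleton of such a pipeline looks roughly like this (a sketch assuming a YOLOv8-style output of shape (1, 84, 8400); file names are placeholders, not the repo code):

      ```python
      # Standalone detection on onnxruntime: preprocess -> run -> postprocess.
      import cv2
      import numpy as np
      import onnxruntime as ort

      sess = ort.InferenceSession("yolov8n.onnx", providers=["CPUExecutionProvider"])
      inp = sess.get_inputs()[0].name

      img = cv2.imread("frame.jpg")                      # placeholder image path
      blob = cv2.resize(img, (640, 640))[:, :, ::-1]     # BGR -> RGB (letterbox skipped for brevity)
      blob = blob.transpose(2, 0, 1)[None].astype(np.float32) / 255.0

      preds = sess.run(None, {inp: blob})[0][0].T        # (8400, 84): 4 box values + 80 class scores
      boxes, scores, class_ids = [], [], []
      for row in preds:
          cid = int(np.argmax(row[4:]))                  # best class for this candidate
          conf = float(row[4 + cid])
          if conf < 0.25:
              continue
          cx, cy, w, h = row[:4]                         # boxes come as center x/y + width/height
          boxes.append([cx - w / 2, cy - h / 2, w, h])
          scores.append(conf)
          class_ids.append(cid)

      keep = cv2.dnn.NMSBoxes(boxes, scores, 0.25, 0.45) # non-max suppression
      detections = [(class_ids[i], scores[i], boxes[i]) for i in np.array(keep).flatten()]
      print(len(detections), "objects")                  # counting is just len() over detections
      ```

      Note the box coordinates here are in the 640x640 input space; map them back to the original image size before counting per region.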

    • @gomgom330 · 7 days ago · +1

      Sure, bro! You got a new subscriber! Is it possible for you to make a video series on inference models using ONNX for your next project? That way, we wouldn’t have to rely on the Ultralytics YOLO API, since it’s pretty rare to find videos about inference for edge devices.