C 7.9 | Fast RCNN Network, Computation Time, Accuracy | CNN | Object Detection | Machine learning

  • Published on 28 Sep 2024
  • Fast RCNN is an improvement on SPPNet.
    The first change was to remove the multi-level pooling in the SPP layer. Instead, a single 7x7 grid is used for SPP (this single-level version is the RoI pooling layer).
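    As a rough illustration of pooling onto a single fixed grid, here is a minimal sketch in plain Python. It assumes a single-channel feature map stored as a list of rows and an RoI already given in feature-map cells; the function name `roi_max_pool` and the bin-boundary rounding are illustrative choices, not the exact code of any library.

```python
def roi_max_pool(feature_map, roi, out_size=7):
    """Max-pool one RoI of a 2-D feature map into an out_size x out_size grid.

    feature_map: list of rows (H x W), one channel only (assumption).
    roi: (x1, y1, x2, y2) in feature-map cells, inclusive.
    """
    x1, y1, x2, y2 = roi
    roi_w = x2 - x1 + 1
    roi_h = y2 - y1 + 1
    pooled = []
    for i in range(out_size):
        # Row range of bin i: floor for the start, ceil for the end,
        # so every bin is non-empty and the bins cover the whole RoI.
        r0 = y1 + (i * roi_h) // out_size
        r1 = y1 + ((i + 1) * roi_h + out_size - 1) // out_size
        row = []
        for j in range(out_size):
            c0 = x1 + (j * roi_w) // out_size
            c1 = x1 + ((j + 1) * roi_w + out_size - 1) // out_size
            row.append(max(feature_map[r][c]
                           for r in range(r0, r1)
                           for c in range(c0, c1)))
        pooled.append(row)
    return pooled


# A 14x14 map whose value encodes its position (row * 14 + col).
fmap = [[r * 14 + c for c in range(14)] for r in range(14)]
pooled = roi_max_pool(fmap, (0, 0, 13, 13))
print(len(pooled), len(pooled[0]))  # 7 7
```

    Whatever the RoI size, the output is always 7x7, which is what lets the fully connected layers that follow accept proposals of any shape.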
    Next, they found that accuracy is more or less the same whether you use Softmax or an SVM for classification, so they dropped the SVM classifier.
    Instead of training the classifier first and then the BBox regressor separately, as was done before, they combined the losses from both and fine-tuned the network back to the third conv layer (conv3).
    Why only down to conv3? As we know, the initial layers of a ConvNet extract generic features, so there is not much to gain by fine-tuning them.
    They also made the FC6 and FC7 layers common to both the classifier and the BBox regressor, and added one extra FC layer to each.
    Lastly, they used Smooth L1 loss in the BBox regressor instead of L2 loss. If you want to know more about these loss functions, see: heartbeat.frit...
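    To make the combined loss concrete, here is a minimal sketch of Smooth L1 and of a classification-plus-regression loss in plain Python. The function names, the `lam` weight and the convention that class 0 means background are assumptions made for illustration; treat this as a sketch of the idea rather than the reference implementation.

```python
import math


def smooth_l1(x):
    """0.5 * x^2 when |x| < 1, otherwise |x| - 0.5."""
    ax = abs(x)
    return 0.5 * x * x if ax < 1.0 else ax - 0.5


def multi_task_loss(class_probs, true_class, box_pred, box_target, lam=1.0):
    """Log loss on the true class plus Smooth L1 on the 4 box offsets.

    class_probs: softmax output, one probability per class.
    true_class: integer label; 0 is treated as background (assumption),
                in which case the box term is skipped.
    """
    cls_loss = -math.log(class_probs[true_class])
    if true_class == 0:
        return cls_loss
    loc_loss = sum(smooth_l1(p - t) for p, t in zip(box_pred, box_target))
    return cls_loss + lam * loc_loss


# Small errors are penalised quadratically, large ones only linearly,
# so one badly wrong box does not dominate the gradient as with L2.
print(smooth_l1(0.5), smooth_l1(3.0))  # 0.125 2.5
```

    Because both terms are summed into one number, a single backward pass updates the shared layers for classification and box regression together.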
    With these changes, you get the Fast RCNN network. At test time it is about 146 times faster than RCNN (VGG16, without truncated SVD), while SPPNet is around 20 times faster than RCNN.
    So, in terms of speed, there has been a major improvement.
    However, on the accuracy front, there is not much change: RCNN reaches 66.0% mAP on VOC 2007, and Fast RCNN reaches 66.9%.
    ------------------------
    This is a part of the course 'Evolution of Object Detection Networks'.
    See full playlist here: • Evolution Of Object De...
    ------------------------
    Copyright Disclaimer: Under section 107 of the Copyright Act 1976, allowance is made for “fair use” for purposes such as criticism, comment, news reporting, teaching, scholarship, education and research.

Comments • 20

  • @jyotisekhar9018 3 years ago +1

    Very nicely explained, it's helping me a lot with my MSc thesis.

  • @Cogneethi 5 years ago +1

    See full course on Object Detection: th-cam.com/play/PL1GQaVhO4f_jLxOokW7CS5kY_J1t1T17S.html
    If you found this tutorial useful, please share with your friends(WhatsApp/iMessage/Messenger/WeChat/Line/KaTalk/Telegram) and on Social(LinkedIn/Quora/Reddit),
    Tag @cogneethi on twitter.com
    Let me know your feedback @ cogneethi.com/contact

  • @ilyasaroui7745 3 years ago +1

    Hello, thank you for the great explanation. There is a question I have been trying to answer and I couldn't find the answer anywhere: if the feature map is not the same size as the input image, how can we map the coordinates of a proposed region onto the feature map?

    • @Cogneethi 3 years ago +1

      You can use ROI projection: th-cam.com/video/wGa6ddEXg7w/w-d-xo.html
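      As a rough sketch of the idea, a box in image pixels can be scaled by the network's total stride to get a box in feature-map cells, and scaled back the other way. The stride of 16 assumes a VGG16-style backbone (for example, 800 pixels becoming 50 cells); the function names and the round-outward choice are illustrative assumptions, and real implementations differ in how they round.

```python
import math


def project_roi(box, stride=16):
    """Map (x1, y1, x2, y2) from image pixels to feature-map cells."""
    x1, y1, x2, y2 = box
    # Round outward so the projected box still covers the whole region.
    return (math.floor(x1 / stride), math.floor(y1 / stride),
            math.ceil(x2 / stride), math.ceil(y2 / stride))


def back_project_roi(cell_box, stride=16):
    """Map a feature-map box back to image pixels."""
    return tuple(v * stride for v in cell_box)


print(project_roi((160, 320, 480, 640)))   # (10, 20, 30, 40)
print(back_project_roi((10, 20, 30, 40)))  # (160, 320, 480, 640)
```

      The conv and pooling layers keep the spatial layout of the image, only at a coarser resolution, so this scaling is enough to find the matching region.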

  • @vinayprakash1687 4 years ago

    In SPPNet, are they using 5 scales of images for training (image pyramid)?

    • @Cogneethi 4 years ago

      Yes. You can use just 1 scale; accuracy will be slightly lower, but execution will be faster. They report the best accuracy with 5 scales.
      See Table 9 in the SPPNet paper for more details.

    • @vinayprakash1687 4 years ago

      @Cogneethi Thank you. I watched your series. Very helpful. 👍

    • @Cogneethi 4 years ago

      @vinayprakash1687 Thanks, Vinay.

  • @akshitsaini9709 4 years ago

    Sir, the explanations were very good, but I am looking for the simplest implementation of both Fast R-CNN and Faster R-CNN. I looked at the code on GitHub, but it is very complex. I just want to code both models myself, so could you please provide the code, or a reference with a step-by-step implementation?

    • @Cogneethi 4 years ago +1

      This might help: www.telesens.co/2018/03/11/object-detection-and-classification-using-r-cnns/#:~:text=In%20the%20current%20version%20(known,of%20a%20region%20containing%20a

    • @Cogneethi 4 years ago

      To be honest, even I couldn't find a simple implementation of this model.
      It took me a lot of time to understand this code too,
      and I still don't understand it completely.
      But implementing any model from scratch helps you understand it better.
      If you find any useful resources, please let me know too.

    • @Amritanjali 4 years ago

      Did you find any simple implementation?

    • @Cogneethi 4 years ago

      @Amritanjali I referred to this one for the tutorial: github.com/endernewton/tf-faster-rcnn

    • @ahmadtalal184 4 years ago

      @Cogneethi I was just wondering who writes these complex implementations of the models: what are their credentials, and how can we match them?

  • @akashsuryawanshi6267 1 year ago

    Amazing work

  • @mikegt2126 4 years ago

    Sir, the region proposals are based on the original image but are applied to the feature map. However, the feature map is already a "transformed" version of the original image (it can be flipped, or half the size). How can region proposals made on the original image be applied correctly to the feature map?

    • @Cogneethi 4 years ago

      Sorry Mike, I could not understand your question.
      Did you mean 'tensor' format?

    • @mikegt2126 4 years ago

      @Cogneethi First, we input the original image, say 800x800. After passing it through VGG we get a 50x50 feature map, and the RPN bounding boxes are then applied to that 50x50 feature map. Since the feature map is a "transformed" version of the original image, how can a bounding box position on the feature map be mapped back to the original image?

    • @Cogneethi 4 years ago

      @mikegt2126 Hey, I have answered this in the other comment; I think the question is the same. Let me know if it is not.