220 - What is the best loss function for semantic segmentation?

  • Published 14 Nov 2024

Comments • 81

  • @Avalaxy
    @Avalaxy 3 years ago +11

    Probably the best explanation I have ever seen on this topic. I love the simple clear code samples, clear visual examples and the good explanation. Thank you!

  • @rajpulapakura001
    @rajpulapakura001 9 months ago +1

    Best explanation for segmentation loss ever!

  • @maryamomar4106
    @maryamomar4106 2 years ago +2

    I could say thank you a hundred times, and that wouldn't be enough. Thank you! You made my life a breeze.

    • @DigitalSreeni
      @DigitalSreeni  2 years ago

      Glad I could help! Keep watching :)

  • @rbhambriiit
    @rbhambriiit 11 months ago

    Nice lecture. One suggestion: you could add links to the research papers you referred to in the description.

  • @pietheijn-vo1gt
    @pietheijn-vo1gt 2 years ago +2

    7:15 I want to add something here. I have had very poor performance on datasets with class imbalance due to this +1 term. I think it's because it works like a 'smoothing' factor for your loss function. Only when I dropped this number to something like 0.1 or even lower could I get any representation of the smallest classes in my predictor; it was a huge revelation when I figured this out after a few days of headaches. (See the sketch at the end of this thread.)

    • @DigitalSreeni
      @DigitalSreeni  2 years ago +3

      For class-imbalanced data, you may find that focal loss does a better job.

    • @AnwerShahzaib
      @AnwerShahzaib 1 year ago

      @@DigitalSreeni In one of your playlists you used the BraTS2020 dataset for multi-class semantic segmentation. I have been on a project that is specific to only t1ce and its mask (mask[mask==2] = 1), basically a binary segmentation model to detect the tumor core. The class imbalance problem persists after experimenting with BinaryFocalLoss and a total loss of (dice_loss + (1 * focal_loss)). The model overfits on class 0 (background), and it overfits in the initial phase of training.
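
      A minimal sketch (TensorFlow/Keras backend; the function name dice_loss is hypothetical and not necessarily the exact code from the video) of a Dice loss with the smoothing term exposed as a parameter, so the effect described at the top of this thread can be tested by lowering it from 1.0 to, say, 0.1:

      from tensorflow.keras import backend as K

      def dice_loss(y_true, y_pred, smooth=1.0):
          # Flatten so the sums run over every pixel in the batch
          y_true_f = K.flatten(y_true)
          y_pred_f = K.flatten(y_pred)
          intersection = K.sum(y_true_f * y_pred_f)
          # The smooth term avoids division by zero, but it also damps the loss
          # contribution of very small (rare-class) regions; smaller values give
          # rare classes relatively more weight, as observed above.
          dice = (2.0 * intersection + smooth) / (K.sum(y_true_f) + K.sum(y_pred_f) + smooth)
          return 1.0 - dice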

  • @caiyu538
    @caiyu538 2 years ago

    I understand the meaning of focal loss, what its purpose is, and the library you mentioned. It looks like we do not need to learn the details of how to implement it (it adds more weight to low-probability classifications). I think most of these are like black boxes; we only need to know what each is for, plus its inputs and outputs. Thanks.

  • @krishkr1809
    @krishkr1809 3 months ago

    The best explanation!

  • @전주석-m5f
    @전주석-m5f months ago

    In image classification tasks, why not use focal loss instead of cross entropy for soft-label data augmentation techniques such as CutMix or MixUp?
    Why not use the sum of the focal loss for each class, as follows:
    'sum_i{alpha_i * y_i * (1 - p_i)**gamma * log(p_i)}'?
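
    A minimal sketch (TensorFlow; the name soft_label_focal_loss is hypothetical) of the per-class sum described in the question above, written with the conventional leading minus sign so the loss stays non-negative:

    import tensorflow as tf

    def soft_label_focal_loss(y_true, y_pred, alpha=0.25, gamma=2.0, eps=1e-7):
        # y_true: soft labels (e.g. produced by MixUp/CutMix), y_pred: softmax probabilities
        y_pred = tf.clip_by_value(y_pred, eps, 1.0 - eps)
        loss = -alpha * y_true * tf.pow(1.0 - y_pred, gamma) * tf.math.log(y_pred)
        return tf.reduce_sum(loss, axis=-1)   # sum over the class axis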

  • @briandavis9296
    @briandavis9296 3 years ago

    Your code example may be oversimplifying the IoU loss. You use `x.sum()`, which generally sums over the batch dimension as well. This gives a different result from the "proper" way of summing only over the spatial dimensions and then averaging/summing across the batch after the per-image IoU is computed.

    • @DigitalSreeni
      @DigitalSreeni  3 years ago +1

      Yes, this is a simplification, but it works fine as a loss function that the optimizer is minimizing. You may want to get down to pixel granularity to compute IoU metrics, but as a loss function this works OK. That said, I do not recommend using IoU loss, as there are much better approaches, for example focal loss.
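
      A minimal sketch (Keras backend, hypothetical names, not the exact code from the video) contrasting the simplified global sum with a per-image IoU that sums only over the spatial/channel axes and then averages across the batch:

      from tensorflow.keras import backend as K

      def iou_loss_global(y_true, y_pred, smooth=1.0):
          # Sums over every axis, including the batch axis
          intersection = K.sum(y_true * y_pred)
          union = K.sum(y_true) + K.sum(y_pred) - intersection
          return 1.0 - (intersection + smooth) / (union + smooth)

      def iou_loss_per_image(y_true, y_pred, smooth=1.0):
          # Sums over height, width and channels only, then averages the
          # per-image IoU across the batch
          axes = [1, 2, 3]
          intersection = K.sum(y_true * y_pred, axis=axes)
          union = K.sum(y_true, axis=axes) + K.sum(y_pred, axis=axes) - intersection
          return 1.0 - K.mean((intersection + smooth) / (union + smooth))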

  • @bijoyalala5685
    @bijoyalala5685 3 years ago +1

    Hello Sreeni Sir, thank you for your informative video explaining focal loss. In my semantic segmentation model I am using
    dice_loss = sm.losses.DiceLoss()
    focal_loss = sm.losses.BinaryFocalLoss()
    total_loss = (1 * focal_loss) + dice_loss, with Adam as the optimizer, a 0.0001 learning rate, and momentum=0.9. The loss is decreasing and keeps approaching negative values; after 100 epochs, the training loss shows -51.3400. Is the negative loss value incorrect? Please let me know a solution for when the loss goes negative.
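
    A minimal sketch (assuming the qubvel segmentation_models package imported as sm, and an already-built Keras model named model) of the setup described above. Note that DiceLoss expects ground-truth masks in the 0-1 range; unnormalized masks (e.g. 0/255) are one common cause of a Dice-based loss drifting far below zero.

    import segmentation_models as sm
    from tensorflow import keras

    dice_loss = sm.losses.DiceLoss()
    focal_loss = sm.losses.BinaryFocalLoss()
    total_loss = dice_loss + (1 * focal_loss)   # weighted sum of the two losses

    # 'model' is assumed to be an already-built Keras segmentation model
    model.compile(
        optimizer=keras.optimizers.Adam(learning_rate=1e-4),  # beta_1=0.9 by default
        loss=total_loss,
        metrics=[sm.metrics.IOUScore(threshold=0.5)],
    )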

  • @marcospaulobatista2375
    @marcospaulobatista2375 3 years ago

    Excellent video! Have you seen focal loss with GAN pix2pix segmentation?

    • @DigitalSreeni
      @DigitalSreeni  3 years ago +1

      No, I only saw binary cross entropy.

  • @utei9502
    @utei9502 3 years ago +1

    Hi Sir, thank you so much for a very clean and easy-to-understand explanation!
    I wonder, for multi-class semantic segmentation, do you have results that compare segmentation networks' performance on actual data using CE and FL? Does FL help improve the overall accuracy as well as the speed of convergence? Also, for accuracy assessment, is average IoU a good metric, or is there a better one, especially for data with imbalanced samples?

    • @DigitalSreeni
      @DigitalSreeni  3 years ago

      It would be a good test to see how fast the model converges with CE vs FL and the effect of each loss function on the overall and individual IoU values. This work has already been done by the original authors of the paper so I try not to repeat much work. I will trust their observations.
      I find IoU to be the best metric to evaluate semantic segmentation. I look at mean IoU during training, but to evaluate I always look at the IoU for each class. That is the only way to find out how effective the model is at segmenting the various classes.
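
      A minimal sketch (NumPy; the name per_class_iou is hypothetical) of the per-class IoU evaluation mentioned above, computed from integer-label ground-truth and prediction maps:

      import numpy as np

      def per_class_iou(y_true, y_pred, num_classes):
          # y_true, y_pred: integer label maps of the same shape
          ious = []
          for c in range(num_classes):
              intersection = np.logical_and(y_true == c, y_pred == c).sum()
              union = np.logical_or(y_true == c, y_pred == c).sum()
              ious.append(intersection / union if union > 0 else float("nan"))
          return ious   # inspect each entry, not just the mean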

  • @arnabmishra827
    @arnabmishra827 2 years ago

    What are the relative merits and demerits of using IoU vs. BCE loss for semantic segmentation, Sir? Recently, many research papers have used pixel-wise BCE loss as the primary loss function for semantic segmentation / salient object detection tasks. Can you please explain?

  • @alcasla90
    @alcasla90 3 years ago

    Great work and great explanation. Thanks

  • @bikcrum
    @bikcrum 1 year ago

    Great explanation!

  • @onlyjimmy4ever391
    @onlyjimmy4ever391 2 years ago

    LOVE UR CLASS SO MUCH SIR!

  • @ExV6120
    @ExV6120 3 years ago +1

    How about Focal loss vs Dice loss?

  • @rezatabrizi4390
    @rezatabrizi4390 3 years ago

    Thank you. I have a question about the Lovasz loss function for binary segmentation: could we have a video about the application of this loss function, please?
    Thank you so much.

  • @tapabrat_thakuria
    @tapabrat_thakuria 4 months ago

    Sir, in the case of smartphone-based oral lesion segmentation, which loss function is important?

  • @tonihullzer1611
    @tonihullzer1611 2 years ago

    Is there a reason you do not threshold your y_pred values for the metric? Because at the end of the day, they have to be binarized in order to get a mask.
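
    A small sketch (Keras backend; the name binarized_iou is hypothetical) of the thresholding the comment refers to, binarizing the sigmoid outputs before measuring overlap:

    from tensorflow.keras import backend as K

    def binarized_iou(y_true, y_pred, threshold=0.5, smooth=1.0):
        # Turn soft predictions into a hard 0/1 mask before computing the metric
        y_pred_bin = K.cast(K.greater(y_pred, threshold), "float32")
        intersection = K.sum(y_true * y_pred_bin)
        union = K.sum(y_true) + K.sum(y_pred_bin) - intersection
        return (intersection + smooth) / (union + smooth)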

  • @emanalajrami1919
    @emanalajrami1919 3 years ago +1

    Hi Sir, thank you for this amazing explanation. I am using U-Net for segmentation; how can I add dropout to it? I also use the Hausdorff distance as a metric, but it did not improve much even when I increased the size of the training dataset. Do you have any ideas that could help? Thanks.

  • @lazy.researcher
    @lazy.researcher 2 years ago

    You are a genius.

  • @krocodilnaohote1412
    @krocodilnaohote1412 3 years ago

    Great stuff, thank you!

  • @learnmore3647
    @learnmore3647 2 years ago

    Hi,
    Please, I want to know which is the correct implementation of the Dice coefficient.
    This code:
    def dice_coef1(y_true, y_pred, smooth=1):
        # Dice per image (sums over H, W, C), then averaged across the batch
        intersection = K.sum(y_true * y_pred, axis=[1,2,3])
        union = K.sum(y_true, axis=[1,2,3]) + K.sum(y_pred, axis=[1,2,3])
        dice = K.mean((2. * intersection + smooth)/(union + smooth), axis=0)
        return dice
    gives me 0.82, while this code:
    def dice_coef2(target, prediction, smooth=1):
        # A single global Dice over all pixels in the batch
        numerator = 2.0 * K.sum(target * prediction) + smooth
        denominator = K.sum(target) + K.sum(prediction) + smooth
        coef = numerator / denominator
        return coef
    gives me 0.94.
    Thank you.

  • @davidvc4560
    @davidvc4560 2 years ago

    excellent!

  • @nagavenik4862
    @nagavenik4862 1 year ago

    Sir, what are main loss, attention loss, and inter-class loss?

  • @gactve2110
    @gactve2110 3 years ago

    Great video! clear and on point!
    You have a new sub :)
    Thank you

  • @florianhofstetter6859
    @florianhofstetter6859 3 years ago

    Is there a loss function that takes into account the perspective of an image? For example, when segmenting lanes, pixels that are further toward the horizon should be weighted higher because they cover a larger area of the road.

    • @DigitalSreeni
      @DigitalSreeni  3 years ago +1

      As you know, an image is just a bunch of numbers, and deep learning algorithms are application agnostic. This means they do not understand perspective as we see it. There must be some analogous data that reflects the perspective; if so, we can create a custom loss function that reflects the perspective in the image. In your example, focal loss may be a good choice, as it is designed to focus more on wrongly classified data.

    • @florianhofstetter6859
      @florianhofstetter6859 3 years ago

      @@DigitalSreeni Yes, I have information about the perspective; I could calculate the homography matrix. Because the images were always taken with the same camera, it does not change over the dataset. I thought focal loss would differ between classes, but my problem is binary classification: lane or not lane on the road.
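
      A minimal sketch (TensorFlow/Keras, all names hypothetical) of one way to encode such a perspective prior for the binary lane case: a pixel-weighted binary cross-entropy whose row weights grow toward the horizon at the top of the image:

      import tensorflow as tf

      def perspective_weighted_bce(y_true, y_pred, max_weight=5.0):
          # y_true, y_pred: (batch, height, width, 1); horizon assumed at row 0
          height = tf.shape(y_true)[1]
          # Row weights: max_weight at the top (far away), 1.0 at the bottom (near)
          row_weights = tf.reshape(tf.linspace(max_weight, 1.0, height), (1, -1, 1))
          bce = tf.keras.losses.binary_crossentropy(y_true, y_pred)   # (batch, H, W)
          return tf.reduce_mean(bce * row_weights)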

  • @ryoungseobkwon9660
    @ryoungseobkwon9660 3 years ago

    Thank you!

  • @diegozegarra4973
    @diegozegarra4973 3 years ago

    Thank you so much!

  • @gunjannaik7575
    @gunjannaik7575 2 years ago

    Why do some people write the loss as (1 - coefficient value) and some as -coefficient value?

    • @DigitalSreeni
      @DigitalSreeni  2 years ago

      If the coefficient value is a number smaller than 1 but still positive (e.g., 0.005, 0.1, 0.5, 0.9, etc.), then (1 - coeff) makes sense. If the coefficient is a negative value (e.g., -5, -4, -3, etc.), then (-coeff) makes sense. The optimizer's job is to minimize this loss function.
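
      A quick numeric illustration of the reply above: if the Dice coefficient improves from 0.6 to 0.9, then (1 - coeff) falls from 0.4 to 0.1, while (-coeff) falls from -0.6 to -0.9; both formulations decrease as the overlap improves, which is all the optimizer needs.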

  • @권령섭학생협동과정조
    @권령섭학생협동과정조 3 years ago

    Thank you for the nice video. I have a simple question: for training the U-Net architecture (semantic segmentation), do we have to prepare images of the same size, or is it okay to train with images of diverse sizes? (I am using CVAT for making training images.) Thank you so much :)

    • @utei9502
      @utei9502 3 years ago

      For training, you'll need to crop the larger images to the same size as the smaller images, so that they can be concatenated into a 4D array and fed into the network.
      You may also want to crop images into smaller tiles, if you have limited GPU memory. If properly done, tiling also improves training performance.
      For inference, though, you can feed images of different sizes into the network, one at a time.
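
      A minimal sketch (NumPy; the name tile_image is hypothetical) of the cropping/tiling idea above: cutting a large image into fixed-size tiles so they can be stacked into a single 4D training array:

      import numpy as np

      def tile_image(image, tile_size=256):
          # Split an (H, W, C) image into non-overlapping tile_size x tile_size tiles,
          # discarding any partial tiles at the right/bottom edges.
          h, w = image.shape[:2]
          tiles = [image[y:y + tile_size, x:x + tile_size]
                   for y in range(0, h - tile_size + 1, tile_size)
                   for x in range(0, w - tile_size + 1, tile_size)]
          return np.stack(tiles)   # shape: (num_tiles, tile_size, tile_size, C)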

  • @mingjuhe773
    @mingjuhe773 2 years ago

    Thanks bro!

  • @umeshpathak825
    @umeshpathak825 3 years ago

    thank you so much sir

  • @reemawangkheirakpam8165
    @reemawangkheirakpam8165 3 years ago

    thank you sir

  • @talha_anwar
    @talha_anwar 3 years ago

    please cover other loss functions also

  • @valeras9416
    @valeras9416 2 years ago

    Thanks for awesome explanation. You saved my day!

  • @rs9130
    @rs9130 3 years ago

    Hello Sreeni,
    Did you face the problem of a low validation score but a high training score while using cross entropy loss, or any other loss? Why does this happen? I even tried shuffling my data. I am using an FCN model for multi-class segmentation.

    • @DigitalSreeni
      @DigitalSreeni  3 years ago +1

      If you are getting a low validation score and a high training score, then your model is overfitting. I don't think it has to do with the loss function; it could, but you need to check other factors first. See if you can simplify the model (not too deep), use augmentation to generalize it, try early stopping, add dropout layers, etc.
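
      A small sketch (Keras; shapes and names are hypothetical) of two of the remedies mentioned above, a Dropout layer inside a convolution block and an EarlyStopping callback:

      from tensorflow import keras
      from tensorflow.keras import layers

      inputs = keras.Input(shape=(256, 256, 3))
      x = layers.Conv2D(64, 3, activation="relu", padding="same")(inputs)
      x = layers.Dropout(0.2)(x)   # randomly drop activations to reduce overfitting

      # Stop training once the validation loss stops improving
      early_stop = keras.callbacks.EarlyStopping(
          monitor="val_loss", patience=10, restore_best_weights=True)
      # model.fit(X_train, y_train, validation_data=(X_val, y_val), callbacks=[early_stop])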

  • @PauloZiemer
    @PauloZiemer 3 years ago

    Thanks

  • @muhammadroshan7315
    @muhammadroshan7315 2 years ago

    Is focal loss best for the binary case too?

    • @DigitalSreeni
      @DigitalSreeni  2 years ago

      Yes. Of course, depends on the problem itself but I have no reason to question it for binary.

  • @jithinnetticadan4958
    @jithinnetticadan4958 3 years ago

    Do you know why the validation accuracy is higher than the training accuracy initially (a gap of approx. 15%), but after reaching 65% the validation accuracy starts decreasing even though training keeps improving?

    • @chiragchauhan8429
      @chiragchauhan8429 3 years ago

      I would suggest using a smaller network, or try decreasing your learning rate to something like 0.00001/0.0008.

    • @jithinnetticadan4958
      @jithinnetticadan4958 3 years ago

      @@chiragchauhan8429 The learning rate is set to 1e-6. If I increase it, these values keep increasing and decreasing abruptly. By using a smaller network, do you mean I should start with 8 and go up to 256 (i.e. the bottleneck layer)?

    • @chiragchauhan8429
      @chiragchauhan8429 3 years ago

      @@jithinnetticadan4958 By a smaller network I mean decreasing the number of layers. Keep the network simple and try experimenting with max pooling and average pooling. How many layers are you using?

    • @jithinnetticadan4958
      @jithinnetticadan4958 3 years ago

      Right now 5 layers (including the encoder and bottleneck), starting from 16 and going up to 256, with the learning rate set to 1e-6. I will also give average pooling a try. 👍

    • @DigitalSreeni
      @DigitalSreeni  3 years ago +1

      I recommend changing the network architecture only if you see an overfitting problem or other issues at the end of training. The first few epochs of training may look weird, but that's nothing to worry about. Validation accuracy may be higher than training accuracy if the model happens to represent the validation data better; this happens with small datasets, or sometimes when the validation data is indeed easy to segment. Try changing the random seed used to split train and validation data; this gives new validation data that may not be as easy to segment.
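
      A tiny sketch (scikit-learn; the array names images and masks are hypothetical) of re-splitting train/validation data with a different random seed, as suggested above:

      from sklearn.model_selection import train_test_split

      # Changing random_state gives a different validation subset
      X_train, X_val, y_train, y_val = train_test_split(
          images, masks, test_size=0.2, random_state=42)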

  • @frankmunoz2437
    @frankmunoz2437 4 months ago

    🎯 Key points for quick navigation:
    01:40 *🛠️ Understanding the U-Net architecture is crucial before building it from scratch in Python.*
    02:21 *🌐 Image dimensions in U-Net can be adjusted to simplify understanding, such as using 256x256 input sizes.*
    03:58 *🖼️ The U-Net model consists of repeated patterns of convolutions, max pooling, and upsampling layers.*
    06:47 *🔗 Concatenation of features in U-Net aids in semantic segmentation by combining context from different scales.*
    08:51 *🔄 Skip connections in U-Net allow features to be combined across network levels for improved semantic segmentation.*
    12:49 *🧱 Defining functions for convolution and encoder blocks in U-Net helps streamline the model construction process.*
    16:30 *🔄 Decoder blocks in U-Net consist of upconvolution, concatenation, and convolution operations.*
    19:29 *🧩 Building the U-Net model involves defining encoder blocks, base block, and decoder blocks systematically.*
    22:58 *🛠️ Focus on the decoder block with inputs from the previous block.*
    23:37 *🖥️ Output layer in U-Net is a convolution with one output using sigmoid activation.*
    24:32 *🧱 U-Net architecture divided into convolution, encoder, and decoder blocks for image segmentation.*
    28:02 *🚀 Model building process includes defining encoder, decoder blocks, and building the U-Net architecture.*
    33:12 *🔄 Use data augmentation to improve results in image segmentation tasks.*
    34:24 *📊 After training, check validation metrics like accuracy and IOU score for model performance evaluation.*
    Made with HARPA AI

  • @salmahayani2710
    @salmahayani2710 3 years ago

    Hello, firstly thanks for this useful video. I want to ask something about the Dice coefficient loss: I'm doing semantic segmentation on 3D CT scans (the Luna16 database) using a 3D U-Net, and my Dice loss gets stuck at 50% and doesn't decrease anymore, for both training and validation. Do you have any idea what the problem could be?
    Waiting for your answer :)

    • @petercappetto928
      @petercappetto928 3 years ago

      Have you normalized your data? I am working with Luna 16 and a 3D U-Net, as well. I forgot to normalize the data and experienced a very high validation loss. Once I normalized the data, the network performance improved drastically.

    • @salmahayani2710
      @salmahayani2710 3 years ago

      @@petercappetto928 Yeah, of course I did (what I understand by normalized is that all pixels are between 0 and 1, or transforming the images to binary 0 or 1?). The problem is that once my loss reaches 0.50 it doesn't go down anymore. I'm really blocked and don't know if the problem is with my data or with my network; I would be so grateful if you could show me how to deal with this.

    • @talha_anwar
      @talha_anwar 3 years ago

      @@petercappetto928 It is not mandatory for the loss function to go below 0.5; it's not something bounded between 0 and 1.

  • @nouhamejri1698
    @nouhamejri1698 3 years ago

    I'm using focal+dice loss for multiclass semantic segmentation. I'm getting good results, but the loss is always 0.7 and doesn't decrease, while if I use cross-entropy loss I get a 0.01 loss. What does this mean?

    • @DigitalSreeni
      @DigitalSreeni  3 years ago +1

      Please try a different optimizer or change the learning rate and also try changing the way you initialize the weights.

  • @ashwiniyadav464
    @ashwiniyadav464 3 years ago

    Sir, what is the best loss function for classification of X-rays?

    • @DigitalSreeni
      @DigitalSreeni  3 years ago

      Loss functions do not care about the application; it can be X-rays, CT, or satellite images. I recommend focal loss for semantic segmentation.

    • @ashwiniyadav464
      @ashwiniyadav464 3 years ago

      Sir, please suggest the latest and best denoising techniques for medical images.

  • @러블리민희
    @러블리민희 3 years ago

    Good!

  • @moazeldefrawy4379
    @moazeldefrawy4379 3 years ago

    Thank You!