219 - Understanding U-Net architecture and building it from scratch

  • Published 24 Dec 2024

Comments • 120

  • @deividrumiancev7356
    @deividrumiancev7356 11 months ago +5

    Great tutorial!! Way better to learn here than from my uni lecturers and teachers!! Keep it up, mate! You are the best!

  • @cvformedicalimages6466
    @cvformedicalimages6466 1 year ago +7

    Thanks for the detailed explanation. This is the first time I am understanding how a U-Net works! Thanks 🙂

  • @ToxikJumper
    @ToxikJumper 1 month ago

    7 minutes in and already everything is clicking. Amazing explanation!

  • @Vikram-wx4hg
    @Vikram-wx4hg 2 years ago +7

    Yes, really enjoyed it!
    Sreeni, you are a fantastic teacher and your tutorials bring out the concepts with remarkable simplicity and clarity.

  • @javierordonez2445
    @javierordonez2445 19 days ago

    Great tutorial! And a great, concise coding session. Packed a college lecture or two into one video!

  • @XX-vu5jo
    @XX-vu5jo 3 years ago +11

    I would love to see a video on 3D U-Net from scratch as well. That would really help in understanding it better.

  • @kozhushko
    @kozhushko 1 month ago

    Thank you! That's a great talent to explain so clearly.

  • @1global_warming1
    @1global_warming1 1 year ago +1

    Thank you very much for such a clear explanation of how to build a U-net architecture from scratch

  • @drforest
    @drforest 11 months ago

    Thanks! If you had changed all the numbers of layers to say, 50, 100, 200, etc would that work, just with different designated layer numbers and whatever associated change in performance. Feels like that might have made the numbers a little easier to follow. But great work.

    • @DigitalSreeni
      @DigitalSreeni 11 months ago

      Thank you very much :)

  • @AmitChaudhary-qx5mc
    @AmitChaudhary-qx5mc 3 years ago

    Sir, I am very grateful for your explanation on semantic segmentation.
    You make everything so easy and sublime.

  • @channagirijagadish1201
    @channagirijagadish1201 1 year ago +1

    Excellent Tutorial. Much appreciated!

  • @abeldechenne6915
    @abeldechenne6915 6 months ago

    That was crystal clear, thank you for the good explanation!

  • @shivamchaurivar2794
    @shivamchaurivar2794 3 years ago +2

    I really love your videos. I hope you make a video on stateful LSTM. It's very tough to find a good video on it.

    • @nikhilmudgal8541
      @nikhilmudgal8541 3 years ago +1

      Seems interesting. I hardly find any videos explaining stateful LSTM myself.

  • @rishabgangwar9901
    @rishabgangwar9901 3 years ago +2

    Thank you so much, sir, for the crystal clear explanation.

  • @SeadoooRider
    @SeadoooRider 3 years ago

    Your channel is gold. Thank you 🙏

  • @RRP3168
    @RRP3168 3 years ago +3

    Great video, but I have a question: if I want to segment my own images, how do I get the masks for training the U-Net?

  • @madeleinedawson8539
    @madeleinedawson8539 1 year ago

    Loved the video!!! So helpful

  • @rohit_mondal__
    @rohit_mondal__ 3 years ago

    Your explanation is actually very good, sir. Thank you. Happy to have subscribed to your channel.

  • @dyahtitisari7206
    @dyahtitisari7206 2 years ago

    Thank you so much, Sir. It's a really great explanation.

  • @IqraNosheen-ek3nk
    @IqraNosheen-ek3nk 1 year ago

    Very good explanation, thanks for making the video.

  • @antonittaeileenpious8653
    @antonittaeileenpious8653 3 years ago +2

    Sir, according to what I have understood, in all the layers we extract some features and apply max pooling to reduce them, and in the upsampling path we increase the spatial dimensions. Where do we actually classify the labelled pixels, vary their weights, and apply a particular threshold to get to our desired ROI?

  • @edmald1978
    @edmald1978 3 years ago

    Thank you very much for this video; the way you explain is really amazing. Thank you for your great channel!

  • @davefaulkner6302
    @davefaulkner6302 5 months ago

    Excellent video; thank you. One issue, however: shouldn't batch normalization follow ReLU rather than the reverse? If you ReLU a Batch Normalized layer, it's no longer normalized. Probably works, as these models are so flexible it would probably compensate in training. Or perhaps I'm confusing Batch Norm with Layer Norm?
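
The ordering point raised above can be checked numerically. A minimal NumPy sketch (synthetic values, not the video's code): after batch normalization the activations have approximately zero mean, but a subsequent ReLU clips the negatives, so the output is non-negative and no longer zero-mean. The widely used Conv → BN → ReLU ordering accepts exactly this.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(loc=2.0, scale=3.0, size=10_000)   # pre-activation values

# Batch normalization (per-batch statistics; gamma=1, beta=0 for simplicity)
bn = (x - x.mean()) / np.sqrt(x.var() + 1e-5)

# ReLU applied after BN clips the negative half, so the result
# is non-negative and no longer zero-mean.
relu_after_bn = np.maximum(bn, 0.0)

print(abs(bn.mean()) < 1e-6)       # True: BN output has ~zero mean
print(relu_after_bn.mean() > 0.0)  # True: ReLU breaks the zero-mean property
```

As the commenter suspects, in practice training compensates for either ordering; both Conv-BN-ReLU and Conv-ReLU-BN appear in published models.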

  • @ericthomas4072
    @ericthomas4072 1 year ago

    Very helpful! Thank you!

  • @msaoc22
    @msaoc22 1 year ago

    Thank you for the nice, simple explanation.

  • @geethaneya2452
    @geethaneya2452 3 years ago

    I would like to see a video on TransUNet. That would really help to understand its concept better.

  • @Алексей-ж4щ4м
    @Алексей-ж4щ4м 9 months ago

    Thanks a lot!
    But there's a question I can't understand: why do we use padding="same" in a decoder block when we are upsampling? I mean, our shape is not the same; it becomes larger. Can somebody help, please?

  • @CD-et7vk
    @CD-et7vk 2 years ago +1

    Doesn't the original U-Net use unpadded convolutions, so the image dimensions are lowered by 2 in each convolution per layer (572 -> 570 -> 568, etc.)?
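
Both padding questions above reduce to the standard convolution output-size formula. A plain-arithmetic sketch (no framework) that reproduces the original paper's unpadded "valid" 3x3 convolutions shrinking 572 -> 570 -> 568, and shows that padding="same" merely keeps each conv from shrinking the map that the preceding upsampling step has already doubled:

```python
def conv_out(n, k=3, p=0, s=1):
    """Convolution output size: floor((n + 2p - k) / s) + 1."""
    return (n + 2 * p - k) // s + 1

# Original paper: unpadded ('valid') 3x3 convs each lose 2 pixels.
print(conv_out(572))             # 570
print(conv_out(conv_out(572)))   # 568

# Decoder block with padding='same' (p=1 for k=3): the growth comes from
# the 2x upsampling step; the 'same' convs then preserve that doubled size.
n = 16
n_up = n * 2                     # 2x2 up-convolution doubles 16 -> 32
print(conv_out(n_up, k=3, p=1))  # 32: unchanged by the 'same' conv
```

So the answer to the decoder question: the enlargement is done by the transposed convolution, and padding="same" only applies to the regular convolutions that follow it, keeping their input and output sizes equal.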

  • @abderrahmaneherbadji5478
    @abderrahmaneherbadji5478 3 years ago

    Great explanation

  • @apekshagopale7095
    @apekshagopale7095 1 year ago

    Can you please tell me how to create masks for SAR images?

  • @caiyu538
    @caiyu538 3 years ago

    Excellent lectures.

  • @AravTristy
    @AravTristy 1 month ago

    Please share the playlist for this video.

  • @dhaferalhajim
    @dhaferalhajim 1 year ago

    What's the number of classes in this structure? I saw one in the input and output.

  • @davidyao2856
    @davidyao2856 8 months ago

    Can this be applied to a DICOM-type dataset?

  • @princekhunt1
    @princekhunt1 21 days ago

    Nice tutorial dini

  • @Luxcium
    @Luxcium 9 months ago

    21:09 I do prefer the functional programming approach… classes are useful to describe functors, monads, maybe and even some “eithers” 😏😏😏😏 this is way easier to understand for me but I don’t say FP is better than OOP or any such… 😅😅😅😅

  • @akshaybatra1777
    @akshaybatra1777 1 year ago

    Does U-Net only work with 3 channels? I have breast mammography in DICOM format; the images have 1 channel (grayscale). Can I still use U-Net?

    • @DigitalSreeni
      @DigitalSreeni 1 year ago

      You can use it for any number of input channels.

    • @akshaybatra1777
      @akshaybatra1777 1 year ago

      What about the image size? My images are 4000x3000. Is it possible to use U-Net on them?

  • @deepak_george
    @deepak_george 3 years ago

    Good work @digitalsreeni! Which tool do you use to view the image mask? In a normal image viewer it shows all black.

    • @DigitalSreeni
      @DigitalSreeni 3 years ago +1

      Use ImageJ.

    • @deepak_george
      @deepak_george 3 years ago

      @@DigitalSreeni Where is the option in ImageJ to configure it to see the mask? I couldn't find the video in which you mentioned this.

  • @talha_anwar
    @talha_anwar 3 years ago

    The decoder part should be the same as the encoder, but in the reverse direction. But when we concatenate, how is this maintained?

  • @hadyanpratama
    @hadyanpratama 3 years ago

    Thank you, very clear explanation.

  • @pallavi_4488
    @pallavi_4488 3 years ago

    Doing an amazing job.

  • @anorderedhole2197
    @anorderedhole2197 2 years ago

    I tried making images with very narrow masks, with a line one pixel in thickness. I noticed that when I resize the images the line gets broken up. Does this become more severe when the image is downsampled in the U-Net model? Does the mask need a very broad pixel width to be useful?
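
The resizing worry above is easy to reproduce. In this toy NumPy sketch (illustrative only), keeping every other row/column, which is what nearest-neighbour downsizing does, can delete a one-pixel-wide line entirely, while the 2x2 max pooling used inside U-Net's encoder keeps a trace of it. One-pixel masks are therefore fragile under resizing, less so under max pooling.

```python
import numpy as np

mask = np.zeros((8, 8), dtype=np.uint8)
mask[3, :] = 1                       # a horizontal line exactly 1 pixel thick

# Nearest-neighbour 2x downsample: keep every other row/column.
nearest = mask[::2, ::2]
print(nearest.sum())                 # 0: row 3 is skipped, the line vanishes

# 2x2 max pooling: each output pixel is the max of a 2x2 block.
pooled = mask.reshape(4, 2, 4, 2).max(axis=(1, 3))
print(pooled.sum())                  # 4: the line survives as one pooled row
```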

  • @anshulbisht4130
    @anshulbisht4130 2 years ago

    Loved your code. I knew the U-Net architecture, but when you showed it with running code and images, it was awesome. I will reimplement it with some other data and see if it works. Just one confusion: what is the ground truth when we are applying Adam, and how is the loss calculated for backprop to work?

  • @jetsdiver
    @jetsdiver 2 years ago

    For segmentation, for example to detect things like flood, fire, smoke, or clouds, is it better to use grayscale or colored images?

  • @effeff3253
    @effeff3253 2 years ago

    Can you please explain these two doubts:
    1) Why has the number of feature maps been reduced to half in each layer of the expansion phase?
    2) Say for the 1st layer of the expansion phase, the input is 16x16 with 1024 feature maps; how does it become 32x32 with 512 feature maps after applying a simple 2x2 up-convolution? I mean, up-convolution simply copies the data into a larger block, so the number of feature maps should have nothing to do with this copying and should remain 1024. When doing the 2x2 up-conv, which 512 feature maps have been taken out of the 1024?

    • @DigitalSreeni
      @DigitalSreeni  2 ปีที่แล้ว

      The number of feature maps has nothing to do with the convolution kernel. The number of feature maps is defined by you, as part of your model. If you define your Conv. as - Conv2D(512, (2, 2), strides=2), you are defining the number of feature maps as 512 and kernel size for the convolution operation as 2x2 and stride as 2. This means your output would have 512 feature maps and the output image dimensions would be whatever you get with a 2x2 kernel and stride 2. Most people have a misunderstanding about this concept and I am glad you asked.

    • @effeff3253
      @effeff3253 2 years ago

      @@DigitalSreeni Thanks for replying, but my doubts still remain. For example, in the first layer of the contraction phase, the output is 64 images of 256x256. When it is subjected to max pooling, the size of each image tile is reduced to half, i.e. we now have 64 images of size 128x128. Now in the 2nd layer, I have 128 filters. Are these 128 filters applied to each of the 64 images of 128x128? If so, for each of the 64 images of size 128x128, I get 128 output images, i.e. a total of 64x128 images of size 128x128, which keeps growing after each convolution operation.
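
The arithmetic in this thread works out because each filter in a convolutional layer spans all input channels: 128 filters applied to a 64-channel input produce 128 feature maps, not 64 x 128. A minimal NumPy sketch using a 1x1 convolution (sizes chosen to match the example above):

```python
import numpy as np

H, W = 128, 128
x = np.random.rand(H, W, 64)   # output of the previous stage: 64 feature maps
w = np.random.rand(64, 128)    # 128 filters, each spanning all 64 input channels (1x1 conv)

# Each of the 128 output maps mixes ALL 64 input channels at once,
# so the channel count does not multiply from layer to layer.
y = np.einsum('hwc,cf->hwf', x, w)
print(y.shape)                 # (128, 128, 128): 128 maps, not 64*128
```

A 3x3 kernel works the same way, just over a 3x3 spatial window; its weight tensor has shape (3, 3, 64, 128).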

  • @anikashrivastava8228
    @anikashrivastava8228 11 months ago

    Sir, can we separate a U-Net? In the sense that we train a U-Net and then save the weights of the encoder, bottleneck, and decoder separately, and then use them separately? Will we get the same reconstruction of a test dataset if we do it with the U-Net (entire architecture) as when we feed it to the encoder, then the bottleneck, then the decoder? Please help.

  • @orioncloud4573
    @orioncloud4573 1 year ago

    Thanks for the clear application.

  • @arshadgeo8829
    @arshadgeo8829 2 years ago

    Hello Sreeni, I wanted to ask a favor: I would like to see the complete implementation of SegNet for satellite imagery, along with SegNet+ResNet (with or without transfer learning). Can you help me out?

  • @pycad
    @pycad 3 years ago

    Thank you for this great explanation

  • @random-yu5hv
    @random-yu5hv 3 years ago

    I really appreciate your videos. Will you cover the SegAN network for medical image segmentation? Best regards.

    • @DigitalSreeni
      @DigitalSreeni 3 years ago +1

      GANs are generative networks so I am reluctant to use them for segmentation. Besides, U-nets do a great job so I haven’t found a reason to find an alternative.

  • @computingyolo5545
    @computingyolo5545 3 years ago

    There is one aspect that is blocking me, at line #12:
    small_dataset_for_training/images/12_training_mito_images.tif
    small_dataset_for_training/masks/12_training_mito_masks.tif
    It's not specified in this lesson whether the large image and large mask stacks have to be left undefined as an address. In other words, how could I address folders with many pictures and masks to be picked up? A simple example, please? Brilliant explanation, Doctor, long life to you!

  • @olubukolaishola4840
    @olubukolaishola4840 3 years ago +1

    👏🏾👏🏾👏🏾👏🏾👏🏾👏🏾

  • @tarasankarbanerjee
    @tarasankarbanerjee 2 years ago

    Dear Sreeni, thanks a lot for this awesome video. Just one question: shouldn't the 'decoder_block' call the 'conv_block' twice?

    • @tahaben-abbou7029
      @tahaben-abbou7029 2 years ago

      No, actually the encoder block already has two conv layers; the decoder should call it once, not twice. Thank you.

    • @tarasankarbanerjee
      @tarasankarbanerjee 2 years ago

      @@tahaben-abbou7029 Thanks, Taha, for your comments. But if you look at the U-Net architecture, the decoder block also has 2 conv layers, just like the encoder block. Hence the question.
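
One way to settle the count in this thread is to list the layers each helper emits. This structural sketch (illustrative names, not the video's exact code) mirrors the common pattern: conv_block holds the two 3x3 convolutions, so a single conv_block call already gives the decoder its two conv layers.

```python
# Structural sketch only: each block is represented by the layer names it emits.
def conv_block():
    return ["conv3x3", "conv3x3"]                  # two convs per block, as in the paper

def encoder_block():
    return conv_block() + ["maxpool2x2"]           # two convs, then downsample

def decoder_block():
    return ["upconv2x2", "concat"] + conv_block()  # one conv_block call suffices

layers = decoder_block()
print(layers.count("conv3x3"))                     # 2: the decoder already has two conv layers
```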

  • @fatmagulkurt2080
    @fatmagulkurt2080 3 years ago +2

    Thank you for your effort in teaching. I really appreciate your videos; I am learning so much about coding. But I couldn't find any code anywhere for classifying multiclass images with DenseNet201. Also, how can I do 5-fold cross-validation when running these deep learning codes? I hope you can help me; it would be so helpful.

  • @lucasdiazmiguez8680
    @lucasdiazmiguez8680 2 years ago

    Hi! Very nice video. Just a question: do you have the link to the original paper?

  • @torikulislam23
    @torikulislam23 3 years ago

    Well, thank you, it was really obliging ❤️

  • @liangyou03
    @liangyou03 26 days ago

    Amazing

  • @rajithakv4449
    @rajithakv4449 3 years ago

    Sir, I have used the U-Net model for segmentation of filamentous structures. Though it gives a good prediction, the predictions are wider than the ground truth. What could be the reason for this? Also, the IoU value is around 0.33. I have also added dropout with 0.5.

    • @DigitalSreeni
      @DigitalSreeni 3 years ago

      Try increasing the threshold value for your filamentous class; I assume the probability around the wider regions is lower. If that is not the case, then please verify your labels; maybe they are also exaggerated. If not, check whether you are working on images of similar size showing features at similar dimensions. Finally, try 3D U-Net, as the prediction can benefit from additional information from the 3rd dimension.
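
The threshold suggestion above can be illustrated with made-up numbers: if the probabilities at the fringe of a predicted filament are lower than at its core, raising the decision threshold narrows the predicted region.

```python
import numpy as np

# A 1-D slice across a predicted filament: high probability at the core,
# lower probability at the fringes that widen the prediction (values are made up).
probs = np.array([0.05, 0.30, 0.55, 0.95, 0.98, 0.95, 0.55, 0.30, 0.05])

width_at_050 = int((probs > 0.5).sum())
width_at_080 = int((probs > 0.8).sum())
print(width_at_050, width_at_080)   # 5 3: a higher threshold narrows the filament
```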

  • @XX-vu5jo
    @XX-vu5jo 3 years ago

    Are you familiar with the attention module? Is it possible to implement it with U-Net? I would love to watch a video about it.

    • @DigitalSreeni
      @DigitalSreeni 3 years ago

      It is coming soon, please stay tuned.

    • @XX-vu5jo
      @XX-vu5jo 3 years ago +1

      @@DigitalSreeni I am always tuned in, woah! Thanks.

  • @CRTagadiya
    @CRTagadiya 3 years ago

    Could you please add this video under your image segmentation playlist?

  • @jyothir07
    @jyothir07 3 years ago

    Sir, I recently joined as your student. I couldn't thank you enough for this teaching. Could you please explain how to create and use a custom data loader for large datasets?

    • @DigitalSreeni
      @DigitalSreeni 3 years ago +1

      I plan on recording a video soon but not sure when it is going to happen. Until then you may find this useful: th-cam.com/video/VNGRlf6ZlQA/w-d-xo.html

  • @nayamascariah776
    @nayamascariah776 3 years ago

    Your videos are really amazing; I am really thankful for your efforts. Sir, I have one doubt: if I want to add the dice coefficient as a loss function, how can I add it?

    • @DigitalSreeni
      @DigitalSreeni 3 years ago +1

      Please check my video 215 for an answer. I also covered it as part of videos 210, 211, and 214. But I wrote my own few lines for dice coefficient in video 215, so you may find it useful.
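
As a companion to the videos referenced above, a dice coefficient takes only a few lines; a common pattern is to use 1 - dice as the loss. A NumPy sketch (a Keras version would swap in backend ops such as K.sum and K.flatten):

```python
import numpy as np

def dice_coefficient(y_true, y_pred, smooth=1.0):
    """Dice = 2*|A∩B| / (|A| + |B|); smooth avoids division by zero on empty masks."""
    y_true, y_pred = y_true.ravel(), y_pred.ravel()
    intersection = np.sum(y_true * y_pred)
    return (2.0 * intersection + smooth) / (np.sum(y_true) + np.sum(y_pred) + smooth)

def dice_loss(y_true, y_pred):
    return 1.0 - dice_coefficient(y_true, y_pred)

mask = np.array([[0, 1], [1, 1]], dtype=float)
print(dice_coefficient(mask, mask))   # 1.0 for a perfect match
print(dice_loss(mask, np.zeros_like(mask)))  # high loss when there is no overlap
```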

  • @rezatabrizi4390
    @rezatabrizi4390 3 years ago

    Thank you so much.

  • @amintaleghani2110
    @amintaleghani2110 3 years ago

    @DigitalSreeni, thank you for your effort making this informative video. I wonder if we can use ResNet for time series data prediction. If so, could you please make a video on the subject? Thanks again.

  • @antonittaeileenpious8653
    @antonittaeileenpious8653 3 years ago

    Sir, is the last layer a fully connected (FC) layer?

    • @DigitalSreeni
      @DigitalSreeni 3 years ago

      U-Net is a fully convolutional network, so there are no fully connected layers.

  • @rishabgangwar9901
    @rishabgangwar9901 3 years ago

    I would like to know more about the .tif format.

  • @cutedevil173
    @cutedevil173 3 years ago

    Hi, it's really interesting and educational. It would be really helpful if you trained U-Net on the Automated Cardiac Diagnosis Challenge (ACDC) using a NIfTI-type dataset.

  • @ARCGISPROMASTERCLASS
    @ARCGISPROMASTERCLASS 2 years ago

    Excellent, happy to subscribe to your channel.

  • @guitar300k
    @guitar300k 3 years ago

    Is U-Net the best for image segmentation?

    • @DigitalSreeni
      @DigitalSreeni 3 years ago

      It is the most widely used framework for image segmentation where a lot of papers have been published. So we know it works.

  • @lemondragon8184
    @lemondragon8184 10 months ago

    awesome

  • @nandankakadiya1494
    @nandankakadiya1494 3 years ago

    Thank you for the great explanation, sir. The code is not available on GitHub. It would be great if you could upload it.

    • @DigitalSreeni
      @DigitalSreeni 3 years ago +1

      It will be there soon... usually a 6 to 8 hr delay, as I need to upload manually.

    • @nandankakadiya1494
      @nandankakadiya1494 3 years ago

      @@DigitalSreeni OK, thanks for the great tutorial.

  • @biplugins9312
    @biplugins9312 3 years ago

    My only choice is to run your software on Colab. It uses the latest TensorFlow, and I had no desire to drop back to version 1.x.
    To correct an error, I had to change the directory structure on keras.utils, and instead of trying to import from unet_model_with_functions_of_blocks, I did a %run on the program from inside Colab. The changes are:
    !pip install patchify
    %run '/content/drive/My Drive/Colab Notebooks/unet_model_with_functions_of_blocks.py'
    #from unet_model_with_functions_of_blocks import build_unet
    from keras.utils.np_utils import normalize
    I don't know why, but on Colab it seems to be running at about half the speed you are seeing in Spyder:
    Epoch 25/25
    40/40 [==============================] - 58s 1s/step - loss: 0.0383 - accuracy: 0.9853 - val_loss: 0.1793 - val_accuracy: 0.9589
    It complained that "lr" and "fit_generator" were deprecated, so I fixed them to:
    model.compile(optimizer=Adam(learning_rate = 1e-3), loss='binary_crossentropy', metrics=['accuracy'])
    history = model.fit(my_generator, validation_data=validation_datagen,
    but it didn't help the speed. In any case, it does work in Colab with the latest TensorFlow.

  • @xichen7867
    @xichen7867 2 years ago

    Hello, teacher! Could you add Chinese subtitles or offer a course on a Chinese video site? Your courses are of very high quality! Thank you!

  • @sorasora3611
    @sorasora3611 2 years ago

    How do you write the U-Net algorithm step by step?

  • @jithinnetticadan4958
    @jithinnetticadan4958 3 years ago

    Will this work for 256x256 RGB images, or should I increase the layers and start from 32/16?

    • @DigitalSreeni
      @DigitalSreeni 3 years ago

      U-net is a framework where you convert an autoencoder architecture into U-net by adding skip connections. There is no right or wrong and the network can be customized for your specific application. The example I provided will work for 256x256 RGB images, you just need to define the number of channels as 3.

    • @jithinnetticadan4958
      @jithinnetticadan4958 3 years ago

      Thanks for the reply.
      I tried using the same, but a single epoch takes up to 30 minutes to complete (without a GPU). Is that normal?

    • @DigitalSreeni
      @DigitalSreeni 3 years ago

      Depends on the amount of data. It will be painfully slow without GPU. Try using Google colab where you get a free GPU.

    • @jithinnetticadan4958
      @jithinnetticadan4958 3 years ago

      Thanks a lot. Actually, my dataset contains 7200 images including the masks, so it's impossible to make use of Google Colab; my only option is to reduce the size of my dataset.

    • @jithinnetticadan4958
      @jithinnetticadan4958 3 years ago

      Also, sir, in your video you mentioned increasing the layers, so I tried increasing the layers by 2 (16, 32), but the number of parameters remains the same. What could be the reason?

  • @himanimogra6824
    @himanimogra6824 1 year ago

    Can we pass an input size of 224 x 224 to U-Net?

    • @himanimogra6824
      @himanimogra6824 1 year ago

      224*224*1

    • @DigitalSreeni
      @DigitalSreeni 1 year ago +1

      Yes. You can pass any image size - U-Net is fully convolutional.

    • @himanimogra6824
      @himanimogra6824 1 year ago

      @@DigitalSreeni Thank you for the reply, sir.
      I have one more doubt: when I am training my model, my kernel keeps dying at the start of the 1st epoch itself. What should I do? I have resized my images to 224*224*224 dimensions.

  • @mattmorgs229
    @mattmorgs229 3 months ago

    Laser eye crab-bot? Pretty good, I think.

  • @mithgaur7419
    @mithgaur7419 3 years ago

    I came looking for copper and I found gold; it would've saved me a lot of time if I had found this channel earlier. Thanks for the awesome content. I'm currently working on a U-Net project using Google Colab, and I can't figure out how to define a distribution strategy for TPU. What is the correct way to do it with this code?

  • @sahartaheri1032
    @sahartaheri1032 3 years ago

    Great, thanks.

  • @talha_anwar
    @talha_anwar 3 years ago

    Best.

  • @HafeezUllah
    @HafeezUllah 3 years ago

    Thank you for this great explanation