Computer Vision with Hüseyin Özdemir
Self-Attention
This video describes the details of Scaled Dot-Product Attention, the specific version of Self-Attention used inside the transformer architecture.
In this video, animations and images, except the ones taken from the reference papers, belong to me.
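As a minimal sketch of the operation named above, Scaled Dot-Product Attention from "Attention Is All You Need" computes softmax(Q K^T / sqrt(d_k)) V. The helper names and the toy input below are made up for illustration, and the learned query/key/value projections are replaced by the identity to keep the example short.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def transpose(m):
    """Swap rows and columns."""
    return [list(col) for col in zip(*m)]


def scaled_dot_product_attention(q, k, v):
    """Return (output, attention weights) for softmax(Q K^T / sqrt(d_k)) V."""
    d_k = len(k[0])
    scores = matmul(q, transpose(k))
    weights = [softmax([s / math.sqrt(d_k) for s in row]) for row in scores]
    return matmul(weights, v), weights


# Toy input: 3 tokens with 2 features each (hypothetical values).
# In real self-attention, Q, K and V are learned linear projections of x;
# here the projections are assumed to be the identity.
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
out, weights = scaled_dot_product_attention(x, x, x)
```

Each row of `weights` sums to 1 and says how much one token attends to every token, including itself.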
References
Attention Is All You Need
Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit,
Llion Jones, Aidan N. Gomez, Lukasz Kaiser, Illia Polosukhin
arxiv.org/abs/1706.03762
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn,
Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer,
Georg Heigold, Sylvain Gelly, Jakob Uszkoreit, Neil Houlsby
arxiv.org/abs/2010.11929
#machinelearning #computervision
#deeplearning #ai #aitutorial #education
#transformer #visiontransformer #vit
#selfattention #multiheadattention
#imageprocessing #datascience
#computervisionwithhuseyinozdemir
Views: 439

Videos

Multi-Head Attention
Views: 140, 6 months ago
First, Self-Attention, the building block of Multi-Head Attention, is defined. Then, Multi-Head Attention is described in detail. Video Contents: 00:00 Self-Attention 07:55 Multi-Head Attention. In this video, animations and images, except the ones taken from the reference papers, belong to me. References: Attention Is All You Need, Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aida...
Comparison of CNN and ViT
Views: 271, 6 months ago
Inductive bias is defined, and the CNN and ViT architectures are compared. All animations and images in this video belong to me. References: Attention Is All You Need, Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, Illia Polosukhin, arxiv.org/abs/1706.03762. An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale, Alexey Dosovitskiy, ...
Vision Transformer
Views: 688, 6 months ago
After its success in NLP, the transformer architecture was adapted for image recognition as the Vision Transformer (ViT). Video Contents: 00:00 Introduction 02:20 Extracting Embedding Vectors 05:13 Self-Attention 12:58 Multi-Head Attention 15:46 MLP 16:28 Classification Head 17:36 Comparison of CNN and ViT. In this video, animations and images, except the ones taken from the reference papers, belong to me. Refer...
Diffusion Models Explained with Math From Scratch
Views: 2.9K, 8 months ago
Diffusion models are a popular generative AI method. Stable Diffusion and OpenAI Sora are diffusion models in which diffusion takes place in latent space instead of image pixel space. Video Contents: 00:38 Sampling from a Standard Gaussian Distribution 02:30 Forward Process 04:58 Noise Addition in Single Step 06:11 Variance Schedule 07:36 Reverse Process 08:32 Derivation of Variational Lower Bound 1...
Greetings
Views: 754, 1 year ago
Hi, my name is Hüseyin Özdemir. Welcome to my channel! This channel is about Computer Vision, Deep Learning, Machine Learning and Artificial Intelligence. In each video, ideas are described step by step in full detail. If you find the videos useful, like, subscribe, share and comment.
Supervised Learning
Views: 87, 1 year ago
Subscribe To My Channel www.youtube.com/@huseyin_ozdemir?sub_confirmation=1 Video Contents: 00:00 Definition of Labeled Dataset 00:53 Supervised Learning Mechanism 03:01 Subcategories of Supervised Learning 03:44 Supervised vs. Unsupervised * Definition of Labeled Dataset * Illustration of Supervised Learning Mechanism * Subcategories of Supervised Learning * Comparison of Supervised and Unsupe...
Unsupervised Learning
Views: 73, 1 year ago
Video Contents: 00:00 Illustration of Labeled and Unlabeled Datasets 01:27 Unsupervised Learning 03:05 Applications of Unsupervised Learning 04:12 Supervised vs. Unsupervised * Illustration of Labeled and Unlabeled Datasets * Unsupervised Learning * Applications of Unsupervised Learning * Comparison of Supervised and Un...
Semi-Supervised Learning
Views: 104, 1 year ago
Video Contents: 00:00 Comparison of Supervised and Unsupervised Learning 01:08 Semi-Supervised Learning 01:45 Self-Training, a Semi-Supervised Learning example * Comparison of Supervised and Unsupervised Learning * Semi-Supervised Learning * Self-Training, a Semi-Supervised Learning example All images and animations in ...
Self-Training
Views: 382, 1 year ago
In this video, Self-Training, a Semi-Supervised Learning method, is described in detail. All images and animations in this video belong to me. #machinelearning #computervision #deeplearning #ai #aitutorial #education #semisupervisedlearning #unlabeleddata #pseudolabel #selftraining #labeleddata #imageprocessing #datascienc...
Self-Supervised Learning
Views: 226, 1 year ago
Video Contents: 00:00 Comparison of Supervised and Unsupervised Learning 01:09 Self-Supervised Learning 01:50 Self-Supervised Learning example with Autoencoder 04:16 Pretext and Downstream Tasks 06:32 Different Types of Pretext Tasks * Comparison of Supervised and Unsupervised Learning considering input data * Self-Supe...
Autoencoder
Views: 71, 1 year ago
Video Contents: 00:00 What is Autoencoder? 00:51 Parts of Autoencoder 02:37 Information about Dataset 03:43 Network & Training 05:59 Dimensionality Reduction 07:00 Self-Supervised Learning * What is Autoencoder? * Parts of Autoencoder: Encoder, Bottleneck and Decoder * Information about Dataset Used To Train Autoencoder...
Logit and Probability
Views: 185, 1 year ago
Video Contents: 00:00 Case for Binary Classification 03:05 Case for Multi-Class Classification 05:41 Case for Multi-Label Classification * Case for Binary Classification * Case for Multi-Class Classification * Case for Multi-Label Classification All images and animations in this video belong to me #machinelearning #comp...
Binary Classification
Views: 142, 1 year ago
Video Contents: 00:00 Definition of Binary Classification 00:50 Binary Classification example 01:24 Binary Classification is a Supervised Learning Method 02:11 About Training & Inference Phases 03:44 Output Layer for Binary Classification 04:50 Sigmoid Activation * Definition of Binary Classification * Binary Classifica...
Multi-Class Classification
Views: 131, 1 year ago
Video Contents: 00:00 Definition of Multi-Class Classification 00:51 Multi-Class Classification example 01:26 Multi-Class Classification is a Supervised Learning Method 02:20 About Training & Inference Phases 03:48 Output Layer for Multi-Class Classification 04:40 Softmax Activation * Definition of Multi-Class Classific...
Multi-Label Classification
Views: 671, 1 year ago
Multi-Class vs. Multi-Label Classification
Views: 771, 1 year ago
Loss Function
Views: 67, 1 year ago
Cost Function
Views: 59, 1 year ago
Binary Cross-Entropy Loss
Views: 467, 1 year ago
Categorical Cross-Entropy Loss
Views: 347, 1 year ago
Linear Transformation
Views: 2.1K, 2 years ago
Affine Transformation
Views: 9K, 2 years ago
Projective Transformation
Views: 15K, 2 years ago
Homogeneous Coordinates
Views: 4.5K, 2 years ago
Rigid Transformation
Views: 790, 2 years ago
Similarity Transformation
Views: 1.7K, 2 years ago
Forward and Backward Image Warping
Views: 5K, 2 years ago
Splatting
Views: 1K, 2 years ago
Image Rotation
Views: 915, 2 years ago

Comments

  • @amirezasobhdel8751, 10 days ago

    This was awesome!

  • @bonbonpony, 2 months ago

    09:04 So by dividing by z, are we projecting the rotated/sheared image back to the z=1 plane, as seen by the "eye of the camera" located at the origin below it?
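The division this question refers to can be sketched as follows. In homogeneous coordinates, (x', y', w) and (x'/w, y'/w, 1) name the same 2D point, so dividing by the third coordinate picks the representative that lies on the plane where that coordinate equals 1. The function name and the matrix values below are made up for illustration.

```python
def apply_homography(h, x, y):
    """Map (x, y) through the 3x3 matrix h, then divide by the third coordinate."""
    xh = h[0][0] * x + h[0][1] * y + h[0][2]
    yh = h[1][0] * x + h[1][1] * y + h[1][2]
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    if w == 0:
        raise ValueError("point maps to infinity (third coordinate is 0)")
    return xh / w, yh / w


# Hypothetical projective matrix: identity except for one perspective term.
H = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0],
     [0.5, 0.0, 1.0]]

# (2, 4, 1) maps to (2, 4, 2) before the division and to (1, 2) after it.
p = apply_homography(H, 2.0, 4.0)
```

When the last row of the matrix is (0, 0, 1), the third coordinate stays 1 and the division does nothing, which is the affine case.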

  • @MasaoTaketani, 3 months ago

    You are really great at explaining the math of diffusion models step by step! In particular, I had seen the math starting at 15:47 before but never understood what the authors meant; your video explained it really well! I hope you will upload more diffusion-related topics in the future. Looking forward to watching more of your videos!

    • @huseyin_ozdemir, 3 months ago

      Thank you very much. I'm glad you find the video useful. I wanted to describe each detail clearly, without skipping any part of the derivations, so viewers can comprehend the whole process easily.

  • @Ashish-sp4hw, 3 months ago

    Mathematics from scratch was something I couldn't find anywhere else. Thank you for making this awesome video ❤. But I didn't understand the following: 1. the reparameterisation part; 2. how the sum of the normals was calculated.
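For readers with the same two questions, here is a short worked sketch. It assumes the usual DDPM notation, with \alpha_t = 1 - \beta_t and \bar\alpha_t = \prod_{s=1}^{t} \alpha_s, which the video may not follow symbol for symbol. The reparameterisation step rewrites a Gaussian sample as a deterministic function of standard noise, and the "sum of normals" step uses the fact that independent zero-mean Gaussians add by adding their variances.

```latex
\begin{align}
% Reparameterisation: a Gaussian sample as mean plus scaled standard noise
x \sim \mathcal{N}(\mu, \sigma^2 I)
  &\iff x = \mu + \sigma\,\epsilon, \qquad \epsilon \sim \mathcal{N}(0, I) \\
% One forward step
x_t &= \sqrt{\alpha_t}\,x_{t-1} + \sqrt{1-\alpha_t}\,\epsilon_{t-1} \\
% Substitute the same expression for x_{t-1}
x_t &= \sqrt{\alpha_t\alpha_{t-1}}\,x_{t-2}
     + \sqrt{\alpha_t(1-\alpha_{t-1})}\,\epsilon_{t-2}
     + \sqrt{1-\alpha_t}\,\epsilon_{t-1} \\
% The two noise terms are independent, so their variances add
\alpha_t(1-\alpha_{t-1}) + (1-\alpha_t) &= 1 - \alpha_t\alpha_{t-1} \\
x_t &= \sqrt{\alpha_t\alpha_{t-1}}\,x_{t-2}
     + \sqrt{1-\alpha_t\alpha_{t-1}}\,\bar\epsilon,
     \qquad \bar\epsilon \sim \mathcal{N}(0, I) \\
% Repeating the argument down to x_0
x_t &= \sqrt{\bar\alpha_t}\,x_0 + \sqrt{1-\bar\alpha_t}\,\epsilon
\end{align}
```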

  • @ምእንቲመጎጎትሕለፍኣንጭዋ, 4 months ago

    Very helpful. Thanks.

  • @utkuerdogan6551, 4 months ago

    Nice explanation. You helped me solve the calibration problem in a grid detection task. In OpenCV, there are methods called "getPerspectiveTransform" and "warpPerspective". If you know the math behind them, two lines of code solve the problem.
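The math behind estimating a perspective transform from four point pairs can be sketched in plain Python. This is not OpenCV's implementation, only the underlying linear system: fixing the bottom-right matrix entry to 1 leaves 8 unknowns, and each point pair contributes 2 equations. All function names and the sample points are made up for illustration.

```python
def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= factor * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def homography_from_4_points(src, dst):
    """Return the 3x3 matrix (last entry fixed to 1) mapping src[i] to dst[i]."""
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        # u = (h11 x + h12 y + h13) / (h31 x + h32 y + 1), rearranged to be linear
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        # v = (h21 x + h22 y + h23) / (h31 x + h32 y + 1), rearranged likewise
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b)
    return [h[0:3], h[3:6], h[6:8] + [1.0]]


def warp_point(h, x, y):
    """Apply the matrix and divide by the third homogeneous coordinate."""
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return ((h[0][0] * x + h[0][1] * y + h[0][2]) / w,
            (h[1][0] * x + h[1][1] * y + h[1][2]) / w)


# Hypothetical correspondences: unit square corners to a skewed quadrilateral.
src = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
dst = [(0.0, 0.0), (2.0, 0.0), (3.0, 2.0), (0.0, 1.0)]
H = homography_from_4_points(src, dst)
```

Warping a whole image then amounts to applying `warp_point` (or the inverse matrix, for backward warping) to every pixel coordinate.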

  • @SplendidKunoichi, 5 months ago

    for years, I've wished to see it explained in just this way !!

    • @huseyin_ozdemir, 5 months ago

      Thank you for your comment

  • @marufahmed3416, 5 months ago

    Very good visual explanation, thanks very much.

    • @huseyin_ozdemir, 5 months ago

      Glad it was helpful!

  • @talon6277, 5 months ago

    Very helpful, well explained. Thank you!

  • @ercancetin6002, 6 months ago

    Nice work

  • @ajkdrag, 6 months ago

    Can you do a video on DETR and the new YOLO models?

  • @doublesami, 6 months ago

    Very good explanation. Could you please make a video on VMamba or Vision Mamba to explain it in depth, for example how the 2D selective scan works? Looking forward to it.

  • @曹靖婕, 7 months ago

    Thanks for your detailed explanation!

  • @zaharvarfolomeev1536, 8 months ago

    Thank you! I liked your video more than anyone else on the topic of momentum.

  • @ivannasha5556, 8 months ago

    Thanks! I was experimenting with IFS fractals 30+ years ago. I did not remember much, and Google was no help. Everyone just lists the well-known basics, and nobody else explains the math you need to make your own.

  • @arinmahapatro61, 11 months ago

    Insightful!

  • @dhirajkumarsahu999, 1 year ago

    Thanks a lot

  • @gneil1985, 1 year ago

    Great insights into the perspective transformation. Very clear explanation.

  • @sixface20, 1 year ago

    Great tutorial

  • @张文杰-b8s, 1 year ago

    Perfect presentation!

  • @thatguy5787, 1 year ago

    This is fantastic. Very well done.

  • @ercancetin6002, 1 year ago

    It is sad that such meticulous work gets so little attention. I wish you success, brother.

    • @huseyin_ozdemir, 1 year ago

      Thank you for your comment. Looking at it more broadly, rather than only in terms of the work I do for my channel, one of the things life has taught me is that every effort and every action has its return. Sometimes it comes right away, sometimes it takes time. Sometimes it comes directly, sometimes by indirect routes.

  • @mehmetozkan1075, 1 year ago

    ABSOLUTELY GOOD JOB. THANK YOU SO MUCH

  • @mehmetozkan1075, 1 year ago

    It's great that you added this lesson as well. Thanks a lot.

    • @huseyin_ozdemir, 1 year ago

      Thank you. I think YOLOv1, YOLOv2 and YOLOv3 are important for understanding how to address object detection in a single pass by formulating it as a regression problem.

  • @mehmetozkan1075, 1 year ago

    It is really a very simple and understandable series. The series is easy to understand and follow. It would be great if you could include courses on OpenCV, advanced computer vision, and Kaggle project solutions. Thank you for all your hard work.

  • @krimafarjallah7553, 1 year ago

    💯🤍

  • @denischikita, 1 year ago

    I didn't get it. How did the input depth go from 3 to 32?

    • @huseyin_ozdemir, 1 year ago

      Those are two different examples. In the first one, at 09:22 of the video, an RGB image is convolved with a 3×3 filter. Since an RGB image has 3 channels, the convolution filter should also have 3 channels. This is a typical filtering operation in an image processing application. The second example, at 12:08 of the video, is more generic: it illustrates a convolution operation at a convolutional layer. That's why the video says "Let our input image depth be 32".
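The point about matching depths can be sketched in a few lines: a filter always has as many channels as its input, and each output value is a sum over the filter's height, width and all channels. The function name and the all-ones test data are made up for illustration, and, as is usual in deep learning code, the filter is not flipped (strictly speaking this is cross-correlation).

```python
def conv_at(image, kernel, row, col):
    """One output value of a multi-channel convolution.

    image is indexed [row][col][channel]; kernel is [i][j][channel].
    """
    kh, kw, channels = len(kernel), len(kernel[0]), len(kernel[0][0])
    if len(image[0][0]) != channels:
        raise ValueError("filter depth must equal input depth")
    total = 0.0
    for i in range(kh):
        for j in range(kw):
            for c in range(channels):
                total += image[row + i][col + j][c] * kernel[i][j][c]
    return total


def constant_volume(height, width, depth, value):
    """Build a height x width x depth volume filled with one value."""
    return [[[value] * depth for _ in range(width)] for _ in range(height)]


# First case: a 3-channel (RGB) patch needs a 3x3x3 filter.
rgb_value = conv_at(constant_volume(3, 3, 3, 1.0), constant_volume(3, 3, 3, 1.0), 0, 0)

# Second case: an input of depth 32 needs a 3x3x32 filter.
deep_value = conv_at(constant_volume(3, 3, 32, 1.0), constant_volume(3, 3, 32, 0.5), 0, 0)
```

In both cases one filter produces a single output channel; a convolutional layer with many filters stacks one such channel per filter.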

  • @denischikita, 1 year ago

    Thank you. I respect your original approach to teaching such a complex topic. It helped me put things in the right place in my mind.

  • @vivekrai1974, 1 year ago

    Very Informative Video. I see that you have covered various topics like mathematics of transformation, supervised learning etc. in your various videos. If you create playlists, it would be easier for the viewers.

  • @mfatihaydogdu7, 1 year ago

    It would be very helpful to create playlists.

  • @muhittinselcukgoksu1327, 1 year ago

    Congratulations on your Digital Image Processing videos. With commercial products everywhere, detailed and explanatory videos like these are an easily accessible resource. Thank you so much.

  • @dinezeazy, 1 year ago

    Man, I really love how you fuse different topics into a single video and then have a separate video for each particular topic. This is great.

    • @huseyin_ozdemir, 1 year ago

      Glad you like the videos. Thanks for the comment.

  • @milanm4772, 1 year ago

    Nicely done. Best explanation.

  • @dinezeazy, 1 year ago

    This is amazing, please do more of these: camera calibration as well, with an example, and what can be achieved using the calibration, like solving the parallax problem, estimating object distance, etc. With your kind of slow and steady explanation, everyone will be able to understand.

  • @mizzonimirko, 1 year ago

    Honestly, I do not fully understand how it works. Given a batch, the output of that hidden layer should have shape batch_size × output_dimension? Doesn't it follow that mean and variance should be vectors?

    • @huseyin_ozdemir, 1 year ago

      Hi, batch normalization can be confusing at first glance. Never mind. Let's say we have a fully connected layer with n neurons. If batch size is m, then each neuron outputs m values for 1 batch of inputs. Mean and variance for that neuron for that batch are computed using those m outputs as described in 09:01 of the video. So mean and variance are scalars and are computed for each batch during training. And one important thing to note is that while computing mean and variance for 1 neuron, only outputs of that neuron are used.
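The per-neuron statistics described in this reply can be sketched as follows. With batch size m and n neurons, the layer output is an m × n table; each neuron (column) gets its own scalar mean and variance, computed only from its own m outputs. The function names and sample values are made up, the variance divides by m (the biased estimate commonly used during training), and the learnable scale and shift parameters are left out for brevity.

```python
import math


def batch_stats(batch_outputs):
    """Return per-neuron means and variances for an m x n batch of layer outputs."""
    m = len(batch_outputs)
    n = len(batch_outputs[0])
    means = [sum(sample[j] for sample in batch_outputs) / m for j in range(n)]
    variances = [sum((sample[j] - means[j]) ** 2 for sample in batch_outputs) / m
                 for j in range(n)]
    return means, variances


def normalize(batch_outputs, eps=1e-5):
    """Normalize each neuron's outputs using that neuron's own batch statistics."""
    means, variances = batch_stats(batch_outputs)
    return [[(sample[j] - means[j]) / math.sqrt(variances[j] + eps)
             for j in range(len(sample))]
            for sample in batch_outputs]


# Hypothetical batch: m = 3 samples, n = 2 neurons.
batch = [[1.0, 10.0],
         [3.0, 20.0],
         [5.0, 30.0]]
means, variances = batch_stats(batch)
normalized = normalize(batch)
```

Collected over all n neurons, the scalars form a vector of n means and a vector of n variances, which reconciles the "scalar per neuron" and "vector per layer" views.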

  • @irshadirshu0722, 1 year ago

    Nice explanation ❤

  • @villagelifebangladesh9636, 1 year ago

    I don't hear any audio... I don't know why.

    • @huseyin_ozdemir, 1 year ago

      I prepared some videos without voiceover. But, that's not an issue :) Each video is fully self-contained.

  • @srihithbharadwaj3421, 1 year ago

    Does forward warping need the depth information?

  • @wolfgangbierling, 1 year ago

    Great work! Thank you for this clear explanation!

  • @cathycai9167, 1 year ago

    Thank you for such a clear video! It really saved me :)

  • @z3515535, 1 year ago

    This is a good video. I am currently looking into implementing deconvolution using TensorFlow. Did you use TensorFlow for your implementation? If so, can you share the code?

  • @FelLoss0, 1 year ago

    Silent video?

    • @huseyin_ozdemir, 1 year ago

      When I first started my channel, I prepared some videos without voiceover. But, I can assure you, those videos, too, include all necessary information and detail as text, diagrams and images to understand the related concepts.

  • @waterspray5743, 1 year ago

    Thank you for making everything concise and straight to the point.

    • @huseyin_ozdemir, 1 year ago

      Thank You for your comment. Glad you liked the video.

  • @aaryannakhat1004, 1 year ago

    Thanks a lot! Was facing difficulty in understanding how mini-batch standard deviation helps prevent mode collapse until I saw this video! Really appreciate it! Great work!

  • @dyyno5578, 1 year ago

    thank you very much for the clear explanation!

  • @mohammadyahya78, 1 year ago

    Third question, please: at 5:13, what do you mean by modulation weights?

  • @mohammadyahya78, 1 year ago

    Thank you again. You mentioned at 4:10 that a dimension is reduced by the reduction ratio r.

    • @huseyin_ozdemir, 1 year ago

      The reduction ratio r is used to create a bottleneck. This way, the network is forced to learn which channels are important. Unimportant channels are then suppressed by scaling them with the modulation weights.
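A minimal sketch of the mechanism in this reply, assuming the block follows the usual squeeze-and-excitation pattern (the thread itself does not name the architecture): average each channel to one number, pass the C averages through a bottleneck of C / r units, expand back to C values, squash them with a sigmoid, and scale each channel by its value. All weights and feature values below are made up, and biases are omitted.

```python
import math


def modulation_weights(feature_maps, w_reduce, w_expand):
    """Compute one weight in (0, 1) per channel via a bottleneck of size C / r."""
    # Squeeze: global average of every channel (feature_maps is [channel][row][col]).
    squeezed = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
                for ch in feature_maps]
    # Bottleneck: C values down to C / r values, followed by ReLU.
    hidden = [max(0.0, sum(w * s for w, s in zip(weights_row, squeezed)))
              for weights_row in w_reduce]
    # Expand back to C values, followed by a sigmoid.
    return [1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(weights_row, hidden))))
            for weights_row in w_expand]


def apply_modulation(feature_maps, weights):
    """Scale every value of a channel by that channel's modulation weight."""
    return [[[value * g for value in row] for row in ch]
            for ch, g in zip(feature_maps, weights)]


# Hypothetical input: C = 4 channels of size 2x2, reduction ratio r = 2.
feature_maps = [[[1.0, 1.0], [1.0, 1.0]],
                [[2.0, 2.0], [2.0, 2.0]],
                [[0.0, 0.0], [0.0, 0.0]],
                [[4.0, 4.0], [4.0, 4.0]]]
w_reduce = [[0.5, 0.0, 0.0, 0.25],   # shape (C / r) x C = 2 x 4
            [0.0, -0.5, 0.5, 0.0]]
w_expand = [[1.0, 0.0],              # shape C x (C / r) = 4 x 2
            [-1.0, 0.0],
            [0.0, 1.0],
            [2.0, 0.0]]
weights = modulation_weights(feature_maps, w_reduce, w_expand)
scaled = apply_modulation(feature_maps, weights)
```

Channels whose weight is close to 0 are suppressed and channels whose weight is close to 1 pass through almost unchanged.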

  • @mohammadyahya78, 1 year ago

    Thank you very much. May I know what the modulation weight at 2:11 is, please?

    • @huseyin_ozdemir, 1 year ago

      A modulation weight scales a channel according to the importance of that channel, so the following layers focus on important information.

  • @muhtasirimran, 1 year ago

    Any link to help understand why the 2nd part works?

  • @AJ-et3vf, 1 year ago

    Awesome video. Thank you