Basics of Computer Vision - An Overview | Learning computer vision principles with examples | Class 1

  • Published on 7 Jan 2023
  • To test your understanding, answer the questions below.
    Simple explanation of convolutional neural networks | Deep Learning (PyTorch & Python) | Basics of Computer Vision | From the human eye to the digital camera | Easy deep learning | Understanding how computer vision starts.
    In this video, I introduce the basics of computer vision, including ANNs and their training, object recognition, and affective computing. The first part of the class covers the basics of machine learning: neural networks, error backpropagation, and gradient descent. The second part applies these concepts to a real-world application, using OpenCV to recognize objects and read text. I hope you find this video helpful and that it helps you understand the basics of computer vision!
    / building-the-intuition...
    Answer the questions below if you understood the concept:
    Question 1:
    What are Channels and Kernels?
    Why should we (nearly) always use 3x3 kernels?
    What happens during the training of a DNN?
    Why is the receptive field important to us?
    Select the odd one out.
    1. Kernel.
    2. A 3x3 matrix that is used to slide or convolve on an image.
    3. Feature Extractor
    4. Feature
    5. Channel
    What should the last layer's pixel's receptive field be?
    1. Equal to the size of the kernel
    2. 400x400
    3. Equal to the size of the image
    What are channels?
    1. Collection of same features
    2. Collection of all neurons that contain information about a specific feature
    3. Output of a kernel
    4. Collection of all neurons that contain information about multiple different features
    What sequence does a neural network's feature extraction follow?
    1. (Starting Features)
    2. (Slightly complex features)
    3. (More complex features)
    4. (Very complex features)
    1. Textures and Patterns
    2. Edges and Gradients
    3. Objects
    4. Part of Objects

Comments • 15

  • @anupagrawal07 1 year ago +2

    What are Channels and Kernels? ->
    Channels are made up of the kernels that convolve over the image, and kernels are the weights that help extract features from the image, such as edges and gradients, patterns, parts of objects, and objects.

    • @BionicBee 1 year ago +1

      Kernels are used to extract various features from an image by convolving over it. A kernel is a small matrix that can carry out a variety of operations on an image. The kernel may be 3x3, 5x5, 7x7, or any other size.
      Using kernels, we can extract features such as vertical and horizontal edges, gradients, and more. Kernels are also used for blurring, sharpening, and edge detection. The following code applies a kernel that acts as an edge detector on the image.
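      ```python
      import cv2
      import numpy as np

      # Your image (replace the placeholder path with a real file path)
      img = cv2.imread(r'Path\\*.png')
      if img is None:
          raise FileNotFoundError('Could not read the input image')

      # Edge detection kernel:
      #   -1 -1 -1
      #   -1  8 -1
      #   -1 -1 -1
      kernel = np.array([[-1, -1, -1],
                         [-1,  8, -1],
                         [-1, -1, -1]], dtype=np.float32)

      # ddepth=-1 keeps the output depth the same as the input image
      output_image = cv2.filter2D(img, -1, kernel)

      # Save the edge-detected image
      cv2.imwrite(r'Path\\edges.png', output_image)
      ```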

  • @anupagrawal07 1 year ago +1

    Select the odd one out. Answer is option (5)
    1. Kernel.
    2. A 3x3 matrix that is used to slide or convolve on an image.
    3. Feature Extractor
    4. Feature
    5. Channel

  • @anupagrawal07 1 year ago +1

    What sequence does a neural network's feature extraction follow?
    1. (Starting Features) -> Edges and Gradients
    2. (Slightly complex features) -> Textures and Patterns
    3. (More complex features) -> Parts of Objects
    4. (Very complex features) -> Objects

    • @BionicBee 1 year ago

      That is correct. That is why we find blocks in well-known models such as ResNet.

  • @anupagrawal07 1 year ago +1

    What are channels?
    Collection of all neurons that contain information about multiple different features

    • @BionicBee 1 year ago

      In this context, a feature is a piece of information about gradients and edges in a picture, and a channel is a collection of those features. Gradients are essentially the way that colour intensity changes.
      A colour image has three distinct channels, labelled Red, Green, and Blue. Other elements, such as edges and gradients, are also handled as channels. One such channel is displayed in the image below. We employ kernels and filters to fetch each of these distinct channels. Convolution is the process of retrieving channels and feature maps.
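
To make the idea that each kernel produces one channel concrete, here is a minimal sketch in plain Python (no OpenCV needed). The 4x4 image, the two hand-written kernels, and the helper name `convolve2d_valid` are illustrative assumptions, not code from the video.

```python
def convolve2d_valid(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the feature map.

    Like most deep learning libraries, this computes cross-correlation:
    the kernel is not flipped before sliding.
    """
    k = len(kernel)
    out_h = len(image) - k + 1
    out_w = len(image[0]) - k + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j] for i in range(k) for j in range(k))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


# Hypothetical 4x4 grayscale image: dark left half, bright right half.
image = [
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
]

# Two hand-written kernels; in a trained network these weights are learned.
vertical_edge = [[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]]
horizontal_edge = [[-1, -1, -1], [0, 0, 0], [1, 1, 1]]

# One kernel gives one channel; stacking the outputs gives a 2-channel feature map.
channels = [convolve2d_valid(image, kern) for kern in (vertical_edge, horizontal_edge)]
```

In this toy image the vertical-edge channel responds strongly everywhere the kernel straddles the dark-to-bright boundary, while the horizontal-edge channel stays at zero because no horizontal edge is present.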

  • @anupagrawal07 1 year ago

    What should the last layer's pixel's receptive field be?
    It should be "Equal to the size of the image". Reason: the receptive field determines the prediction capability.

    • @BionicBee 1 year ago

      That is correct. I will discuss it more in the next video.
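
As a rough illustration of how the receptive field grows until it can cover the whole image, the sketch below uses the standard recurrence for stacked convolution or pooling layers. It assumes square kernels and no dilation; the helper names `receptive_field` and `layers_to_cover` are hypothetical and not taken from the video.

```python
def receptive_field(layers):
    """Return the receptive field (in input pixels) of one output pixel.

    `layers` is a list of (kernel_size, stride) pairs, ordered input to output.
    """
    rf, jump = 1, 1  # jump = distance in input pixels between adjacent outputs
    for kernel_size, stride in layers:
        rf += (kernel_size - 1) * jump
        jump *= stride
    return rf


def layers_to_cover(image_size, kernel_size=3):
    """Count stride-1 layers needed for the receptive field to reach `image_size`."""
    count = 0
    while receptive_field([(kernel_size, 1)] * count) < image_size:
        count += 1
    return count


print(receptive_field([(3, 1)] * 3))                  # three stacked 3x3 layers
print(receptive_field([(3, 1), (2, 2), (3, 1)]))      # 3x3 conv, 2x2 pool, 3x3 conv
print(layers_to_cover(400))                           # layers needed for a 400x400 image
```

Under these assumptions, three stacked 3x3 layers see a 7x7 patch, and a 400x400 input (the size mentioned in the quiz) would need 200 such stride-1 layers. That is one reason practical networks add strided or pooling layers, which make the receptive field grow faster.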

  • @anupagrawal07 1 year ago

    Why is the receptive field important to us? ->
    The receptive field is important because it is the region of the image from which a neuron receives its stimuli and gets activated. Once a neuron is activated, it starts capturing information about that part of the image. This captured information is nothing but edges and gradients, patterns, parts of objects, and the objects themselves. There are two types of receptive field: a) local and b) global. Every layer has a local receptive field, and every layer other than the first also has a global receptive field.

    • @BionicBee 1 year ago

      That's not 100% correct. I will cover it in the next video.

  • @anupagrawal07 1 year ago

    Why should we (nearly) always use 3x3 kernels?
    They provide symmetry and also result in fewer trainable parameters.

    • @BionicBee 1 year ago

      Another aspect to mention is that Nvidia has optimised its GPUs to run 3x3 convolutions faster. If necessary, stacked 3x3 kernels can also cover the same area as a 5x5, 7x7, and so on.
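
The parameter argument in this thread can be checked with a little arithmetic. The sketch below counts weights for a single input channel and a single output channel and ignores biases, which is a simplifying assumption; the function names are hypothetical. Note that stacked 3x3 layers match the receptive field of a larger kernel but are not mathematically identical to it, especially with activations in between.

```python
def stacked_3x3_layers(target_size):
    """Number of stacked 3x3, stride-1 layers whose receptive field equals target_size."""
    if target_size < 3 or target_size % 2 == 0:
        raise ValueError("target_size must be an odd number >= 3")
    return (target_size - 1) // 2


def weights_single(kernel_size):
    """Weights in one kernel_size x kernel_size kernel (one input, one output channel)."""
    return kernel_size * kernel_size


def weights_stacked_3x3(target_size):
    """Total weights in the stack of 3x3 kernels covering target_size x target_size."""
    return stacked_3x3_layers(target_size) * 9


for size in (5, 7):
    print(size, weights_single(size), weights_stacked_3x3(size))
```

With these assumptions, two 3x3 kernels use 18 weights against 25 for one 5x5 kernel, and three 3x3 kernels use 27 against 49 for one 7x7 kernel.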

  • @anupagrawal07 1 year ago

    What determines the optimal number of features to extract?
    It depends on how many features are actually present in the images/dataset.

    • @BionicBee 1 year ago

      We can't calculate how many feature extractors we need. It's going to be intuitive, but it depends on your data and the hardware we have to train on.