In spatially separable kernels I don't understand how the actual number of multiplications is reduced. Let's say we have a 3x3 kernel and a 3x3 input. The first vector will perform convolution with the 3 rows of the input, right? One convolution involves 3 multiplications, and 3 convolutions will require 3x3 = 9 multiplications. I feel that the reduction would happen only if there were matrix multiplication taking place. But in convolutions we are doing element-wise multiplication, so I can't see a reduction in cost, unless you are taking into account things like parallelization or broadcasting.
Perhaps look at it this way: say the final output of a convolution operation has Z data points, and k is the kernel size. Then for each of the Z points we require k*k multiplications and additions, so the total number of operations is Z*k*k. But in separable convolutions, the convolution with the first 1D filter costs about Z*k multiplications and additions, and the second adds another Z*k, so the total number of operations is roughly 2*Z*k. For k = 3 that is about 6Z instead of 9Z.
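If it helps, here is a minimal NumPy sketch (my own illustration, not from the video) that counts multiplications for a full 3x3 pass versus the two 1-D passes of a separable kernel, and checks that both give the same output:

```python
import numpy as np

def conv2d_valid(img, ker):
    """Plain 'valid' 2-D cross-correlation that also counts multiplications."""
    kh, kw = ker.shape
    oh, ow = img.shape[0] - kh + 1, img.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    mults = 0
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(img[i:i+kh, j:j+kw] * ker)
            mults += kh * kw
    return out, mults

# A separable 3x3 kernel is the outer product of a column and a row vector
col = np.array([[1.0], [2.0], [1.0]])   # 3x1 vertical filter
row = np.array([[1.0, 0.0, -1.0]])      # 1x3 horizontal filter
kernel = col @ row                      # full 3x3 (Sobel-like) kernel

img = np.random.rand(8, 8)

full, m_full = conv2d_valid(img, kernel)  # one 2-D pass
mid, m1 = conv2d_valid(img, col)          # first 1-D pass (vertical)
sep, m2 = conv2d_valid(mid, row)          # second 1-D pass (horizontal)

print(m_full, m1 + m2)   # → 324 252: fewer multiplications for the two 1-D passes
assert np.allclose(full, sep)
```

Note the two-pass count (252) is a bit more than the ideal 2*Z*k = 216, because the intermediate image after the first pass is wider than the final output; the saving over Z*k*k = 324 still holds, and it grows with kernel size.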
Not "atrous", it's à trous convolution (from French, à trous meaning "with holes"!)
Pronounced "à trou" (like "hou hou"), the final s is silent.
Good video, but a question: will the intermediate image incur extra memory in separable convolution?
The one that needed the most explanation (transposed convolution) was explained briefly and very badly!
Isn't 2D just one data type for a CNN? The other data types are 1D and 3D.