Depthwise Convolution
A convolution is performed on each channel separately, with no interaction between channels
in_channels = out_channels = groups (as in PyTorch's Conv2d arguments)
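The per-channel behaviour described above can be sketched in plain Python. This is a minimal illustration, assuming stride 1, no padding, and the cross-correlation form of convolution that deep learning frameworks use. The helper names conv2d_single and depthwise_conv2d are made up for this example and are not any library's API.

```python
def conv2d_single(channel, kernel):
    """Valid (no padding), stride-1 cross-correlation of one 2D channel with one 2D kernel."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(channel) - kh + 1
    out_w = len(channel[0]) - kw + 1
    return [
        [
            sum(channel[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def depthwise_conv2d(x, kernels):
    """x: list of C channels; kernels: list of C kernels, exactly one per channel.

    Each channel is filtered only by its own kernel, so channels never mix.
    """
    if len(x) != len(kernels):
        raise ValueError("depthwise convolution needs one kernel per input channel")
    return [conv2d_single(ch, k) for ch, k in zip(x, kernels)]


x = [
    [[1, 2, 3], [4, 5, 6], [7, 8, 9]],  # channel 0
    [[1, 0, 1], [0, 1, 0], [1, 0, 1]],  # channel 1
]
kernels = [
    [[1, 0], [0, 1]],  # kernel for channel 0
    [[1, 1], [1, 1]],  # kernel for channel 1
]
y = depthwise_conv2d(x, kernels)
print(y)  # [[[6, 8], [12, 14]], [[2, 2], [2, 2]]]
```

The output has the same number of channels as the input, and changing the values in channel 1 would leave the result for channel 0 untouched.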
Pointwise Convolution
No spatial convolution is performed within a channel; each output value is a weighted combination of the input channels at the same position, so the only interaction is between channels
kernel_size=1
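With kernel_size=1, the operation reduces to a per-pixel linear mix of the channels. The sketch below assumes no bias term and uses a hypothetical helper name, pointwise_conv2d, chosen for this example only.

```python
def pointwise_conv2d(x, weights):
    """1x1 convolution without bias.

    x:       [C_in][H][W] input feature map
    weights: [C_out][C_in] mixing matrix (one row per output channel)
    returns: [C_out][H][W]
    """
    c_in = len(x)
    h, w = len(x[0]), len(x[0][0])
    return [
        [
            [sum(row[c] * x[c][i][j] for c in range(c_in)) for j in range(w)]
            for i in range(h)
        ]
        for row in weights
    ]


x = [
    [[1, 2], [3, 4]],        # channel 0
    [[10, 20], [30, 40]],    # channel 1
]
weights = [
    [1, 1],    # output 0 = ch0 + ch1
    [1, -1],   # output 1 = ch0 - ch1
    [0, 2],    # output 2 = 2 * ch1
]
y = pointwise_conv2d(x, weights)
print(y[0])  # [[11, 22], [33, 44]]
```

Height and width are unchanged, while the channel count goes from 2 to 3, which is why pointwise convolution is commonly used to change the number of channels.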
Depthwise Separable Convolution
A depthwise convolution followed by a pointwise convolution: spatial filtering within each channel first, then mixing across channels
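One common motivation for this factorization is the saving in weights. The following is a rough count, assuming square k x k kernels and no bias terms. The function names and the example sizes (64 input channels, 128 output channels) are illustrative choices, not values from these notes.

```python
def standard_conv_weights(c_in, c_out, k):
    """Weights in an ordinary k x k convolution: every output channel sees every input channel."""
    return k * k * c_in * c_out


def depthwise_separable_weights(c_in, c_out, k):
    """Weights in a depthwise (k x k, one filter per channel) plus pointwise (1 x 1) pair."""
    depthwise = k * k * c_in
    pointwise = c_in * c_out
    return depthwise + pointwise


standard = standard_conv_weights(64, 128, 3)
separable = depthwise_separable_weights(64, 128, 3)
ratio = standard / separable
print(standard, separable, round(ratio, 2))  # 73728 8768 8.41
```

The ratio works out to 1 / (1/c_out + 1/k^2), so for 3 x 3 kernels the saving approaches a factor of 9 as the number of output channels grows.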
Group Convolution
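The notes give no description for this entry. As generally understood, group convolution splits the input and output channels into groups and convolves within each group only, so there is no interaction between groups. The sketch below counts weights under that definition, assuming square kernels and no bias; conv2d_weight_count is a hypothetical helper whose parameter names simply mirror PyTorch's Conv2d arguments.

```python
def conv2d_weight_count(in_channels, out_channels, kernel_size, groups=1):
    """Number of weights in a grouped 2D convolution (no bias).

    Each output channel only sees in_channels // groups input channels.
    """
    if in_channels % groups or out_channels % groups:
        raise ValueError("in_channels and out_channels must both be divisible by groups")
    return out_channels * (in_channels // groups) * kernel_size * kernel_size


print(conv2d_weight_count(64, 128, 3, groups=1))   # 73728 (ordinary convolution)
print(conv2d_weight_count(64, 128, 3, groups=4))   # 18432 (4 independent groups)
print(conv2d_weight_count(64, 64, 3, groups=64))   # 576   (depthwise: one group per channel)
```

Seen this way, ordinary convolution (groups=1) and depthwise convolution (groups equal to the channel count) are the two extremes of the same operation.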
Spatially Separable Convolution
Spatially separable convolution replaces a two-dimensional k×k kernel with two one-dimensional kernels, one applied along the vertical direction (k×1) and one along the horizontal direction (1×k), applied in sequence to produce the final output feature map. This cuts the weights per kernel from k² to 2k (from 9 to 6 for a 3×3 kernel) and reduces the amount of computation accordingly. The decomposition is exact only when the 2D kernel can be written as the product of a column vector and a row vector. It is also well suited to extracting “line”-like features.
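The equivalence between one 2D pass and two 1D passes can be checked on a small example. The sketch below uses the Sobel horizontal-gradient kernel, which is the outer product of [1, 2, 1] and [-1, 0, 1], and assumes stride 1, no padding, and cross-correlation. The helper names and the sample image are made up for illustration.

```python
def correlate2d(img, kernel):
    """Valid (no padding), stride-1 cross-correlation of a 2D image with a 2D kernel."""
    kh, kw = len(kernel), len(kernel[0])
    return [
        [
            sum(img[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(len(img[0]) - kw + 1)
        ]
        for i in range(len(img) - kh + 1)
    ]


def outer(col, row):
    """Outer product of a column vector and a row vector, as a 2D kernel."""
    return [[c * r for r in row] for c in col]


col = [1, 2, 1]     # vertical 1D kernel (smoothing)
row = [-1, 0, 1]    # horizontal 1D kernel (difference)
sobel_x = outer(col, row)  # the full 3x3 kernel, 9 weights

img = [
    [1, 2, 4, 7, 11],
    [2, 3, 5, 8, 12],
    [4, 5, 7, 10, 14],
    [7, 8, 10, 13, 17],
]

full = correlate2d(img, sobel_x)                   # one 2D pass
vertical = correlate2d(img, [[c] for c in col])    # 3x1 pass down the columns
two_pass = correlate2d(vertical, [row])            # then a 1x3 pass along the rows

print(full == two_pass)  # True
```

The two 1D kernels hold 6 weights in total instead of 9. A kernel that is not an outer product of two vectors cannot be split exactly this way.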
Dilated Convolution (Atrous Convolution)
The number of parameters stays the same, but the receptive field becomes larger: a kernel of size k with dilation d spans d(k-1)+1 input positions
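The growth in receptive field can be computed directly. This is a small sketch, assuming stride-1 layers with the same kernel size throughout; both function names are hypothetical.

```python
def effective_kernel_size(kernel_size, dilation=1):
    """Span of input positions covered by one dilated kernel along one axis."""
    return dilation * (kernel_size - 1) + 1


def stacked_receptive_field(kernel_size, dilations):
    """Receptive field (one axis) of stride-1 layers stacked with the given dilations."""
    rf = 1
    for d in dilations:
        rf += effective_kernel_size(kernel_size, d) - 1
    return rf


for d in (1, 2, 4):
    # A 3x3 kernel always has 9 weights, whatever the dilation.
    print(d, effective_kernel_size(3, d))  # 1 3, then 2 5, then 4 9

print(stacked_receptive_field(3, [1, 1, 1]))  # 7
print(stacked_receptive_field(3, [1, 2, 4]))  # 15
```

Three ordinary 3x3 layers see 7 positions along each axis, while the same three layers with dilations 1, 2, 4 see 15, with an identical weight count.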
Transposed Convolution
Information reconstruction by upsampling: enlarges the input's height and width; commonly used in semantic segmentation
(also used in other upsampling tasks, e.g., image super-resolution)
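How a transposed convolution enlarges its input is easiest to see in one dimension: each input value is multiplied by the kernel and added into the output at a position scaled by the stride. The sketch below assumes no padding for the 1D helper, and the size formula follows the one documented for PyTorch's ConvTranspose2d. Both function names are made up for this example.

```python
def conv_transpose_output_size(size, kernel_size, stride=1, padding=0,
                               output_padding=0, dilation=1):
    """Output length along one axis of a transposed convolution."""
    return ((size - 1) * stride - 2 * padding
            + dilation * (kernel_size - 1) + output_padding + 1)


def conv_transpose1d(x, kernel, stride=1):
    """1D transposed convolution, no padding: scatter-add a scaled kernel per input value."""
    out = [0] * ((len(x) - 1) * stride + len(kernel))
    for i, value in enumerate(x):
        for a, weight in enumerate(kernel):
            out[i * stride + a] += value * weight
    return out


upsampled = conv_transpose1d([1, 2, 3], [1, 1], stride=2)
print(upsampled)  # [1, 1, 2, 2, 3, 3]

# A 16-pixel side with kernel 4, stride 2, padding 1 doubles to 32.
print(conv_transpose_output_size(16, 4, stride=2, padding=1))  # 32
```

With a kernel of all ones and stride 2, the result is plain nearest-neighbour upsampling; a learned kernel lets the network choose how to fill in the new positions.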
1D Convolution
Processes sequential data (time, text, etc.)
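For a single-channel sequence, the kernel simply slides along the one axis. This is a minimal sketch, assuming no padding and the cross-correlation form; conv1d here is a hypothetical standalone helper, not a library function.

```python
def conv1d(sequence, kernel, stride=1):
    """Valid (no padding) 1D cross-correlation of a sequence with a kernel."""
    k = len(kernel)
    return [
        sum(sequence[i + a] * kernel[a] for a in range(k))
        for i in range(0, len(sequence) - k + 1, stride)
    ]


signal = [1, 4, 9, 16, 25]            # e.g., samples over time
diffs = conv1d(signal, [-1, 1])       # first-difference kernel
print(diffs)                          # [3, 5, 7, 9]
print(conv1d(signal, [-1, 1], stride=2))  # [3, 7]
```

For text, each position would typically hold an embedding vector rather than a single number, and the kernel would span all embedding dimensions while still sliding along the sequence axis only.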
3D Convolution
Processes three-dimensional data, such as height + width + depth (CT scans) or height + width + time (videos)