Depthwise Separable Convolution

In the CNN course, the teacher assigned homework to rewrite AlexNet using depthwise separable convolutions.

Huh? Depthwise separable convolution? What is that? After searching online for a while, I found some good material, summarized below.

The concept is explained very clearly, with some background knowledge as well, but the code demonstration is in Keras. I will add PyTorch implementation videos later (I have finished the principles and skimmed the code part).

PyTorch code implementation (not yet understood)
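As a starting point for the PyTorch side, here is a minimal sketch of a depthwise separable convolution. The class name and shapes are my own illustration, not from the course; the key detail is `groups=in_ch`, which makes the 3x3 convolution act on each channel independently (depthwise), before a 1x1 (pointwise) convolution mixes the channels.

```python
import torch
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    """Depthwise separable convolution: a per-channel (depthwise) 3x3
    convolution followed by a 1x1 (pointwise) convolution that mixes channels.
    Illustrative sketch, not the course's reference implementation."""
    def __init__(self, in_ch, out_ch, stride=1):
        super().__init__()
        # groups=in_ch: each 3x3 filter sees only its own input channel
        self.depthwise = nn.Conv2d(in_ch, in_ch, kernel_size=3, stride=stride,
                                   padding=1, groups=in_ch, bias=False)
        # 1x1 convolution combines information across channels
        self.pointwise = nn.Conv2d(in_ch, out_ch, kernel_size=1, bias=False)

    def forward(self, x):
        return self.pointwise(self.depthwise(x))

x = torch.randn(1, 32, 56, 56)
y = DepthwiseSeparableConv(32, 64)(x)
print(y.shape)  # torch.Size([1, 64, 56, 56])
```

Compared with a plain 3x3 convolution from 32 to 64 channels (32*64*3*3 weights), this uses 32*3*3 + 32*64 weights, which is where the parameter and FLOP savings come from.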

V1 highlights

V2 highlights (not yet understood)

Inverted residual structure

  • First use PW (pointwise convolution) to increase dimensions, then DW (depthwise convolution), then PW to reduce dimensions

ReLU6

Linear bottlenecks
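The three V2 points above fit together in one block. Below is a sketch of that structure under my own naming, assuming the standard MobileNetV2 layout: PW expand with ReLU6, DW with ReLU6, then a PW projection with no activation (the linear bottleneck), plus a residual connection when shapes allow.

```python
import torch
import torch.nn as nn

class InvertedResidual(nn.Module):
    """MobileNetV2-style inverted residual block (illustrative sketch):
    1x1 PW expand -> 3x3 DW -> 1x1 PW project.
    The projection has no ReLU6 -- this is the 'linear bottleneck'."""
    def __init__(self, in_ch, out_ch, stride=1, expand=6):
        super().__init__()
        hidden = in_ch * expand
        # residual only when the input and output shapes match
        self.use_res = stride == 1 and in_ch == out_ch
        self.block = nn.Sequential(
            nn.Conv2d(in_ch, hidden, 1, bias=False),   # PW: increase dimensions
            nn.BatchNorm2d(hidden),
            nn.ReLU6(inplace=True),
            nn.Conv2d(hidden, hidden, 3, stride, 1,    # DW: per-channel 3x3
                      groups=hidden, bias=False),
            nn.BatchNorm2d(hidden),
            nn.ReLU6(inplace=True),
            nn.Conv2d(hidden, out_ch, 1, bias=False),  # PW: reduce dimensions (linear)
            nn.BatchNorm2d(out_ch),
        )

    def forward(self, x):
        out = self.block(x)
        return x + out if self.use_res else out

x = torch.randn(1, 32, 56, 56)
y = InvertedResidual(32, 32)(x)
print(y.shape)  # torch.Size([1, 32, 56, 56])
```

It is "inverted" relative to a classic residual bottleneck: the block is wide in the middle (expanded channels) and narrow at the ends, rather than the other way around.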

V3 highlights (not yet understood)

Updated Block (bneck)

  • Added SE (Squeeze-and-Excitation) module

Using NAS (Neural Architecture Search) to search parameters

Redesigned the time-consuming layers