
Example of “submanifold” dilation. Left: Original curve. Middle: Result of applying a regular 3 × 3 convolution with weights 1/9. Right: Result of applying the same convolution again.

SSC(·, ·, 3) receptive field centered at different active spatial locations. Active locations in the field are shown in green. Red locations are ignored by SSC so the pattern of active locations remains unchanged.
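
The two captions above can be checked on a toy example. The sketch below is a minimal illustration in plain Python, not the authors' implementation: it applies a zero-padded 3 × 3 box filter (weights 1/9) to a one-pixel-wide diagonal curve and counts active sites, then repeats the experiment while only evaluating outputs at sites that were active in the input. The helper names (`active_sites`, `regular_conv3x3`, `submanifold_conv3x3`) and the 7 × 7 grid are assumptions made for illustration.

```python
def active_sites(grid):
    """Return the set of (row, col) sites whose value is non-zero."""
    return {(r, c) for r, row in enumerate(grid) for c, v in enumerate(row) if v != 0}


def regular_conv3x3(grid):
    """Zero-padded 3x3 convolution with all weights equal to 1/9."""
    h, w = len(grid), len(grid[0])
    out = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            total = 0.0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < h and 0 <= cc < w:
                        total += grid[rr][cc]
            out[r][c] = total / 9.0
    return out


def submanifold_conv3x3(grid):
    """Same filter, but outputs are kept only at sites active in the input."""
    dense = regular_conv3x3(grid)
    active = active_sites(grid)
    return [[dense[r][c] if (r, c) in active else 0.0
             for c in range(len(grid[0]))]
            for r in range(len(grid))]


# A one-pixel-wide diagonal "curve" on a 7x7 grid.
curve = [[1.0 if r == c else 0.0 for c in range(7)] for r in range(7)]

once = regular_conv3x3(curve)
twice = regular_conv3x3(once)
ssc_twice = submanifold_conv3x3(submanifold_conv3x3(curve))

# Number of active sites: original, after one regular conv, after two.
counts = (len(active_sites(curve)), len(active_sites(once)), len(active_sites(twice)))
print(counts)
# The submanifold variant leaves the active pattern unchanged.
print(active_sites(ssc_twice) == active_sites(curve))
```

With the regular convolution the active set widens by one site in every direction per layer, which is the dilation shown in the first figure. The restricted variant keeps the original pattern, matching the second caption.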

How

  1. A sparse input corresponds to a d-dimensional grid of sites, each of which is associated with a feature vector. We define a site in the input to be active if any element in its feature vector is not in its ground state, for instance, if it is non-zero.

    1. In many practical problems, thresholding may be used to eliminate sites at which the feature vector is within a very small distance from the ground state.
  2. Sparse convolution SC(m, n, f, s): a convolution with m input feature planes, n output feature planes, a filter size of f, and stride s.

    1. If the input has size l, then the output will have size (l − f + s)/s.


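    1. SC only operates on active input sites: input from non-active sites is treated as zero.
      1. The dilation problem still seems to be present, because the output is not restricted the way it is in SSC.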
      2. Whereas this is a seemingly small change to the convolution operation, it may bring computational benefits in practice.
    2. Max-pooling MP(f, s) and average pooling AP(f, s) operations are defined as a variant of SC(·, ·, f, s).
  3. Submanifold sparse convolution SSC(m, n, f): a modified SC(m, n, f, s = 1).

    1. Pad the input with (f − 1)/2 zeros on each side, so that the output has the same size as the input.
    2. Restrict an output site to be active if and only if the corresponding input site is active (i.e., if the central site in the receptive field is active).
    3. Whenever an output site is determined to be active, its output feature vector is computed by the SSC convolution.
  4. Submanifold FCNs and U-Nets


    1. conv: CNN?
    2. conv block: SSC(·, ·, 3)
    3. conv-s2: SC(·, ·, 2, 2)
    4. dconv-s2: DC(·, ·, 2, 2)
  5. Deconvolution / Transpose Convolution DC(·, ·, f, s)

    1. The deconvolution operation DC(·, ·, f, s) is defined as an inverse of SC(·, ·, f, s). The set of active output sites from a DC convolution is exactly the same as the set of input active sites to the corresponding SC convolution: the connections between input and output sites are simply inverted. This is rather vague. It can be read as an ordinary deconvolution, except that the active sites do not increase.
    2. Transpose Convolution With Stride 2, No Padding


    The stride option is used to set how far apart the original cells are in the intermediate grid.

    A stride of 1 is then used for the convolution from the intermediate grid to the output.
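
The size formula for SC and the intermediate-grid view of the transpose convolution can be sketched together. The code below is a dense 1-D illustration in plain Python under stated assumptions: no padding, an input size that fits the filter and stride exactly, and hypothetical helper names (`sc_output_size`, `dc_output_size`, `transposed_conv1d`). It does not model active sites. The sparse DC described in the notes would additionally restrict the outputs to the active input sites of the matching SC.

```python
def sc_output_size(l, f, s):
    """Output size of SC(., ., f, s) on an input of size l: (l - f + s) / s."""
    assert (l - f) % s == 0, "input size must fit the filter and stride exactly"
    return (l - f + s) // s


def dc_output_size(l, f, s):
    """Output size of a transpose convolution with no padding: (l - 1) * s + f."""
    return (l - 1) * s + f


def transposed_conv1d(x, kernel, stride):
    """Transpose convolution via an explicit intermediate grid (1-D, no padding)."""
    f = len(kernel)
    # 1. Place the input cells `stride` apart, filling the gaps with zeros.
    spread = [0.0] * ((len(x) - 1) * stride + 1)
    for i, v in enumerate(x):
        spread[i * stride] = v
    # 2. Pad f - 1 zeros on each side so every kernel tap reaches every cell.
    grid = [0.0] * (f - 1) + spread + [0.0] * (f - 1)
    # 3. Slide the flipped kernel over the intermediate grid with stride 1.
    flipped = kernel[::-1]
    return [sum(grid[o + k] * flipped[k] for k in range(f))
            for o in range(len(grid) - f + 1)]


# conv-s2 as SC(., ., 2, 2) halves a size-6 input; dconv-s2 restores the size.
down = sc_output_size(6, 2, 2)
up = dc_output_size(down, 2, 2)
print(down, up)

result = transposed_conv1d([1.0, 2.0, 3.0], [1.0, 10.0], stride=2)
print(result)
```

Each input value is stamped onto the output through the kernel, so with stride 2 and a filter of size 2 the stamps do not overlap and the output length equals `dc_output_size(3, 2, 2)`.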

Discussion

In essence, this just restricts the input and output of a CNN, combined with an efficient hash-table-based implementation.
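
To make the hash-table idea concrete, here is a minimal sketch in plain Python, assuming active sites are stored as a list of coordinate tuples with one feature row per site. The function names (`build_rule_book`, `ssc_apply`), the data layout, and the nested-list weight format are illustrative assumptions, not the actual implementation from [30] or [31].

```python
from itertools import product


def build_rule_book(coords, f=3):
    """For each filter offset, list (input_row, output_row) pairs between active sites."""
    # Hash table: coordinate tuple -> row index in the feature matrix.
    table = {coord: row for row, coord in enumerate(coords)}
    d = len(coords[0])
    half = (f - 1) // 2
    rules = {}
    for offset in product(range(-half, half + 1), repeat=d):
        pairs = []
        for out_coord, out_row in table.items():
            in_coord = tuple(o + k for o, k in zip(out_coord, offset))
            in_row = table.get(in_coord)  # inactive neighbours are simply absent
            if in_row is not None:
                pairs.append((in_row, out_row))
        if pairs:
            rules[offset] = pairs
    return rules


def ssc_apply(features, rules, weights, n_out):
    """Accumulate weights[offset] times the input feature for every rule."""
    out = [[0.0] * n_out for _ in features]  # one output row per active site
    for offset, pairs in rules.items():
        w = weights[offset]  # n_out x n_in matrix as nested lists
        for in_row, out_row in pairs:
            x = features[in_row]
            for j in range(n_out):
                out[out_row][j] += sum(w[j][i] * x[i] for i in range(len(x)))
    return out


# Three active sites on a diagonal, one feature plane each.
coords = [(0, 0), (1, 1), (2, 2)]
features = [[1.0], [2.0], [3.0]]
rules = build_rule_book(coords, f=3)
# All-ones 1x1 weight matrix for every offset of the 3x3 filter.
weights = {off: [[1.0]] for off in product((-1, 0, 1), repeat=2)}
out = ssc_apply(features, rules, weights, n_out=1)
print(sorted(rules))
print(out)
```

Only offsets that connect two active sites appear in the rule book, and the output has exactly one row per active input site, so the work scales with the number of active sites and not with the size of the grid.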

References

  1. [30] Submanifold sparse convolutional networks, 2017.
  2. [31] 3D semantic segmentation with submanifold sparse convolutional networks, CVPR, 2018.
  3. http://makeyourownneuralnetwork.blogspot.com/2020/02/calculating-output-size-of-convolutions.html