https://github.com/zhanggang001/HEDNet
Introduction
- SAFDNet yielded a 2.1% mAP gain over the previous best sparse detector, FSD V2 (2023), while being 1.3× faster.
- It surpassed the previous best hybrid detector, HEDNet (NeurIPS 23), by 2.6% mAP while being 2.1× faster.
- VoxelNeXt (CVPR 23) directly predicts objects from the features nearest to object centers, but exhibits inferior accuracy.
- simple.

Related work
Both SWFormer and VoxelNeXt exhibit inferior accuracy compared to hybrid detectors.
Method
HEDNet (NeurIPS 23) vs SAFDNet (CVPR 24 oral)


- voxel feature encoder (VFE): same
- SSR vs SRB: same
  - submanifold sparse residual (SSR) block: HEDNet, NeurIPS 23
  - sparse residual block (SRB): same as SSR
  - Most voxel-based methods [4, 7, 18] adopt sparse CNNs to extract features. These CNNs typically comprise a series of sparse residual blocks, where each block contains two submanifold sparse convolutions and a skip connection linking its input and output.
  - SSR ⇒ results in a receptive field of limited size
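To make the "limited receptive field" point concrete, here is a minimal pure-Python sketch of a sparse residual block (two submanifold convolutions plus a skip connection). It is not the HEDNet/SAFDNet implementation: the dict-based sparse map, the all-ones kernel, and the names `subm_conv2d` and `sparse_residual_block` are assumptions made for illustration only.

```python
from itertools import product

def subm_conv2d(feats, weight=1.0, k=3):
    """Submanifold sparse convolution on a sparse map {(x, y): float}.

    Outputs exist only at sites that are already active, and only active
    neighbours contribute, so the active set never grows.
    The kernel is a constant `weight` everywhere (illustrative assumption).
    """
    r = k // 2
    out = {}
    for (x, y) in feats:
        acc = 0.0
        for dx, dy in product(range(-r, r + 1), repeat=2):
            acc += weight * feats.get((x + dx, y + dy), 0.0)
        out[(x, y)] = acc
    return out

def relu(feats):
    return {c: max(v, 0.0) for c, v in feats.items()}

def sparse_residual_block(feats):
    """Two submanifold convolutions and a skip connection from input to output."""
    h = relu(subm_conv2d(feats))
    h = subm_conv2d(h)
    return relu({c: feats[c] + h[c] for c in feats})

# Two clusters of nonempty voxels separated by empty space.
voxels = {(0, 0): 1.0, (1, 0): 1.0, (10, 0): 5.0}
out = sparse_residual_block(voxels)

# Changing the far voxel does not affect the near cluster at all.
voxels_far_changed = {(0, 0): 1.0, (1, 0): 1.0, (10, 0): 100.0}
out_far_changed = sparse_residual_block(voxels_far_changed)
```

Because the active set is frozen, stacking more such blocks still cannot carry information across the empty gap between the two clusters, which is the limitation the encoder-decoder blocks are meant to address.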
- SED vs 3D-EDB
  - sparse encoder-decoder (SED): HEDNet, NeurIPS 23; enlarges the receptive field through downsampling and upsampling
  - 3D sparse encoder-decoder block (3D-EDB): same as SED
- 2D DED vs 2D-EDB
  - 2D dense encoder-decoder (DED): HEDNet, NeurIPS 23
  - 2D sparse encoder-decoder block (2D-EDB):
    - 3D-EDB uses 3D submanifold sparse convolutions (2017)
    - 2D-EDB uses 2D submanifold sparse convolutions; the structure is otherwise the same.
- adaptive feature diffusion (AFD): this is the only real difference; AFD requires voxel classification
  - assign a larger diffusion range to voxels within the bounding boxes of large objects
  - assign a smaller range to voxels within the bounding boxes of small objects or the background
  - hence voxel classification is necessary.
  - Tab. 6: diffusion is much better than no diffusion; AFD is slightly better than UFD while being only slightly less efficient than no diffusion, so the overall trade-off is excellent.
  - AFD works well on three 3D sparse backbones: HEDNet, VoxelNet, and PillarNet.
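The diffusion idea above can be sketched in a few lines of pure Python. This is only an illustration under stated assumptions, not the paper's code: the category names, the kernel sizes in `KERNEL`, the zero initialisation of newly activated voxels, and both function names are hypothetical, and the grid is left unclipped for brevity.

```python
from itertools import product

# Hypothetical diffusion window sizes per predicted voxel category.
KERNEL = {"large": 5, "small": 3, "background": 1}

def adaptive_feature_diffusion(feats, category, kernel=KERNEL):
    """Expand the active set of a sparse BEV map {(x, y): float}.

    Each active voxel activates the k x k window around it, where k depends
    on that voxel's predicted category. Newly activated voxels receive a
    zero feature (an assumption of this sketch); existing features are kept.
    """
    out = dict(feats)
    for (x, y) in feats:
        r = kernel[category[(x, y)]] // 2
        for dx, dy in product(range(-r, r + 1), repeat=2):
            out.setdefault((x + dx, y + dy), 0.0)
    return out

def uniform_feature_diffusion(feats, k):
    """Baseline for comparison: the same window size k for every voxel."""
    return adaptive_feature_diffusion(feats, {c: "u" for c in feats}, {"u": k})

# Three isolated voxels, one per category (toy voxel classification result).
feats = {(0, 0): 1.0, (20, 0): 2.0, (40, 0): 3.0}
category = {(0, 0): "large", (20, 0): "small", (40, 0): "background"}

afd = adaptive_feature_diffusion(feats, category)
ufd = uniform_feature_diffusion(feats, 5)
print(len(afd), len(ufd))  # 35 active voxels vs 75
```

In this toy case the adaptive variant activates 25 + 9 + 1 voxels, while the uniform variant activates 25 around every voxel, which hints at why AFD can stay close to the no-diffusion cost.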

- Sparse detection head
  - center heatmap: using the voxels in which object centers fall (as done in CenterPoint) to compute the Gaussian heatmap during classification training led to rapid convergence of the classification loss to zero.
  - nearest heatmap: use the nonempty voxel nearest to the object center as the center when generating the Gaussian heatmap.
  - Tab. 6: the nearest heatmap is far better than the center heatmap: 55.2 ⇒ 71.0% mAPH
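The difference between the two heatmap targets can be illustrated with a small pure-Python sketch. It assumes unit-sized BEV voxels indexed by integer coordinates and a fixed `sigma`; the function names and the toy scene are hypothetical and not taken from the SAFDNet code.

```python
import math

def gaussian_heatmap(active, peak, sigma):
    """Gaussian target evaluated only at active (nonempty) voxels."""
    return {
        (x, y): math.exp(-((x - peak[0]) ** 2 + (y - peak[1]) ** 2)
                         / (2.0 * sigma ** 2))
        for (x, y) in active
    }

def center_heatmap(active, center, sigma=1.0):
    """Peak at the voxel that contains the object center."""
    peak = (math.floor(center[0]), math.floor(center[1]))
    return gaussian_heatmap(active, peak, sigma)

def nearest_heatmap(active, center, sigma=1.0):
    """Peak at the nonempty voxel whose midpoint is closest to the object center."""
    peak = min(
        active,
        key=lambda v: (v[0] + 0.5 - center[0]) ** 2 + (v[1] + 0.5 - center[1]) ** 2,
    )
    return gaussian_heatmap(active, peak, sigma)

# Toy scene: only one side of the object is nonempty; its center voxel is empty.
active = [(2, 5), (3, 5), (4, 5)]
center = (3.4, 8.2)  # falls in voxel (3, 8), which is not in `active`

ch = center_heatmap(active, center)
nh = nearest_heatmap(active, center)
```

In this toy scene the center heatmap never reaches 1 at any nonempty voxel, so a CenterPoint-style loss would see no positive sample for the object, which is one plausible reading of why the classification loss collapses to zero. The nearest heatmap always places a value of exactly 1 on a voxel that actually exists.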