PV-RCNN, CVPR 2020

PV: Point-based + Voxel-based feature learning methods.

Voxel-based networks: efficiently encode multi-scale feature representations.
PointNet-based networks: preserve accurate location information with flexible receptive fields.
A two-step strategy: voxel-to-keypoint 3D scene encoding, followed by keypoint-to-grid RoI feature abstraction.

Predicted Keypoint Weighting

Keypoints are sampled from the raw point cloud with the Furthest Point Sampling (FPS) strategy.
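
To make the sampling step concrete, here is a minimal sketch of greedy furthest point sampling in plain Python. The function name `farthest_point_sampling`, the `start_index` parameter, and the toy point cloud are illustrative assumptions, not the paper's implementation (which would run on GPU over far more points).

```python
import math

def farthest_point_sampling(points, n_keypoints, start_index=0):
    """Pick n_keypoints indices from points (a list of (x, y, z)) greedily."""
    if not 0 < n_keypoints <= len(points):
        raise ValueError("n_keypoints must be in 1..len(points)")
    # min_dist[i]: distance from point i to its nearest already-selected keypoint
    min_dist = [math.inf] * len(points)
    selected = [start_index]  # assumption: the first keypoint is chosen by the caller
    for _ in range(n_keypoints - 1):
        last = points[selected[-1]]
        # Only the newest keypoint can shrink a point's nearest-keypoint distance.
        for i, p in enumerate(points):
            d = math.dist(p, last)
            if d < min_dist[i]:
                min_dist[i] = d
        # Next keypoint: the point farthest from everything selected so far.
        selected.append(max(range(len(points)), key=min_dist.__getitem__))
    return selected

# Toy cloud: two nearly coincident points plus three spread-out ones.
cloud = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (5.0, 0.0, 0.0),
         (0.0, 3.0, 0.0), (5.0, 4.0, 0.0)]
keypoint_ids = farthest_point_sampling(cloud, 3)
```

On this toy input the near-duplicate point at index 1 is skipped in favor of points that spread across the scene, which is why FPS gives keypoints that cover the whole point cloud.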

Keypoints belonging to foreground objects should contribute more to the accurate refinement of the proposals, while those from background regions should contribute less.
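
One way to picture this weighting: predict a foreground confidence in (0, 1) for each keypoint and scale that keypoint's feature vector by it. The sketch below assumes a single hypothetical linear layer (`score_weights`, `score_bias`) followed by a sigmoid as a stand-in for the learned predictor. The names and toy numbers are illustrative only.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def reweight_keypoint_features(features, score_weights, score_bias=0.0):
    """Scale each keypoint feature vector by its predicted foreground confidence.

    score_weights and score_bias form a hypothetical linear scorer standing in
    for a learned foreground predictor.
    """
    reweighted, confidences = [], []
    for f in features:
        logit = sum(w * x for w, x in zip(score_weights, f)) + score_bias
        conf = sigmoid(logit)  # near 1 for foreground, near 0 for background
        confidences.append(conf)
        reweighted.append([conf * x for x in f])
    return reweighted, confidences

# Two toy keypoints with 2-D features; the scorer only looks at the first channel.
features = [[2.0, 1.0], [-2.0, 0.5]]
weighted, conf = reweight_keypoint_features(features, score_weights=[3.0, 0.0])
```

The first keypoint keeps almost all of its feature magnitude, while the second is suppressed toward zero, so it has little influence on the later proposal refinement.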

Keypoint-to-grid RoI Feature Abstraction for Proposal Refinement

Uniformly sample 6 × 6 × 6 grid points within each 3D proposal.
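
The grid sampling can be sketched as follows. The box layout `(cx, cy, cz, dx, dy, dz, yaw)` and the choice to place points at sub-cell centers are assumptions made for illustration. `roi_grid_points` is a hypothetical helper, not a function from the paper's code.

```python
import math

def roi_grid_points(box, grid_size=6):
    """Return grid_size**3 points laid out uniformly inside a 3D box.

    box = (cx, cy, cz, dx, dy, dz, yaw): assumed layout with the box center,
    full extents along the box axes, and heading about the z-axis.
    """
    cx, cy, cz, dx, dy, dz, yaw = box
    cos_y, sin_y = math.cos(yaw), math.sin(yaw)
    # Fractional offsets in (-0.5, 0.5), one per sub-cell center.
    fracs = [(i + 0.5) / grid_size - 0.5 for i in range(grid_size)]
    points = []
    for fx in fracs:
        for fy in fracs:
            for fz in fracs:
                lx, ly, lz = fx * dx, fy * dy, fz * dz  # box-local coordinates
                # Rotate about z by yaw, then translate to the box center.
                points.append((cx + lx * cos_y - ly * sin_y,
                               cy + lx * sin_y + ly * cos_y,
                               cz + lz))
    return points

# Toy car-sized proposal.
grid = roi_grid_points((10.0, 2.0, -1.0, 3.9, 1.6, 1.5, 0.3))
```

Each of the 216 grid points then serves as a query location around which nearby keypoint features are aggregated for that proposal.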

Experiments

KITTI: batch size 24, learning rate 0.01, 80 epochs on 8 GTX 1080 Ti GPUs.
Waymo Open Dataset: batch size 64, learning rate 0.01, 50 epochs on 32 GTX 1080 Ti GPUs.
Tables 6, 7, and 8 analyze the contributions of the voxel CNN, the keypoints, RoI-grid pooling, and each individual feature.

References

PV-RCNN: Point-Voxel Feature Set Abstraction for 3D Object Detection. CVPR 2020.

Shaoshuai Shi, Chaoxu Guo, Li Jiang, Zhe Wang, Jianping Shi, Xiaogang Wang, and Hongsheng Li.