- 3D object detection and tracking using center points in the bird's-eye view (BEV).
- It converts the sparse output of a backbone network into a dense BEV feature map and predicts a dense heatmap of object center locations.
- CenterPoint outperforms all previous single-model methods by a large margin and ranks first among all LiDAR-only submissions.
- https://paperswithcode.com/paper/center-based-3d-object-detection-and-tracking
- 2024: BEVFusion is the best.
- https://github.com/tianweiy/CenterPoint
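The "sparse output to dense BEV map" step in the overview can be pictured with a toy scatter. This is only a minimal sketch under assumptions: `scatter_to_bev` is a hypothetical helper, each non-empty cell carries a single float instead of a feature vector, and the real backbone may flatten the height axis differently.

```python
def scatter_to_bev(coords, feats, height, width):
    """Place sparse per-cell features onto a dense (height x width) BEV grid.

    coords: list of (row, col) grid indices of non-empty cells.
    feats:  one float per non-empty cell (a stand-in for a feature vector).
    Cells that receive no feature stay at 0.0.
    """
    bev = [[0.0] * width for _ in range(height)]
    for (r, c), f in zip(coords, feats):
        # Ignore indices that fall outside the grid.
        if 0 <= r < height and 0 <= c < width:
            bev[r][c] = f
    return bev


# Two occupied cells in a 3 x 4 grid; everything else stays empty.
coords = [(0, 1), (2, 3)]
feats = [0.5, 0.9]
bev = scatter_to_bev(coords, feats, height=3, width=4)
```

A 2D detection head can then run ordinary dense convolutions over `bev` to produce the center heatmap.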
1 Introduction
1.1 Motivation
1.1.1 Rotation misalignment


- When the vehicle drives on a straight road, both anchor-based and our center-based methods detect objects accurately (top).
- However, during a safety-critical left turn (bottom), anchor-based methods have difficulty fitting axis-aligned bounding boxes to rotated objects.
- Our center-based model accurately detects objects through rotationally invariant points.
- Why? Both of them predict boxes from local features.
- key advantages
- unlike bounding boxes, points have no intrinsic orientation ⇒ dramatically reduces the object detector’s search space while allowing the backbone to focus on …
- a center-based representation simplifies downstream tasks such as tracking. If objects are points, tracklets are paths in space and time.
- point-based feature extraction enables us to design an effective two-stage refinement module that is much faster than previous approaches [44–46]
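To make the "tracklets are paths" point concrete, here is a minimal sketch of associating detected centers with existing tracks by plain nearest-center greedy matching. The function name `greedy_match` and the `max_dist` threshold are hypothetical, and the tracker in the paper may differ in its details.

```python
import math


def greedy_match(tracks, detections, max_dist):
    """Greedily associate detections with tracks by center distance.

    tracks:     dict of track_id -> (x, y) last known center.
    detections: list of (x, y) centers in the current frame.
    Returns a dict of detection_index -> track_id; unmatched
    detections are simply absent from the result.
    """
    # All candidate pairs, closest first.
    pairs = sorted(
        (math.dist(t_xy, d_xy), tid, di)
        for tid, t_xy in tracks.items()
        for di, d_xy in enumerate(detections)
    )
    matches, used_tracks = {}, set()
    for dist, tid, di in pairs:
        if dist > max_dist:
            break  # every remaining pair is even farther apart
        if tid in used_tracks or di in matches:
            continue  # each track and each detection is used at most once
        matches[di] = tid
        used_tracks.add(tid)
    return matches


tracks = {7: (0.0, 0.0), 8: (10.0, 0.0)}
detections = [(9.5, 0.2), (0.3, -0.1), (50.0, 50.0)]
matches = greedy_match(tracks, detections, max_dist=2.0)
```

Here detection 0 continues track 8, detection 1 continues track 7, and detection 2 is left unmatched, so it would start a new tracklet.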
Advantages
Performs better in cases where the yaw angle deviates a lot, the aspect ratio is large, or the object size is extreme.
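One way to see why large yaw offsets and large aspect ratios hurt axis-aligned boxes is a small worked calculation. The sketch below computes how much of the tightest axis-aligned box is actually covered by a rotated rectangle, which equals the IoU between the two. The helper names and the 4.5 m x 2.0 m car size are illustrative assumptions, not values from the paper.

```python
import math


def aabb_of_rotated_box(length, width, yaw):
    """Extents (dx, dy) of the tightest axis-aligned box around a rotated rectangle."""
    c, s = abs(math.cos(yaw)), abs(math.sin(yaw))
    return length * c + width * s, length * s + width * c


def fill_ratio(length, width, yaw):
    """Area of the rotated rectangle divided by the area of its axis-aligned box."""
    dx, dy = aabb_of_rotated_box(length, width, yaw)
    return (length * width) / (dx * dy)


# A car-sized box at increasing yaw angles relative to the grid axes.
for deg in (0, 15, 45):
    print(deg, round(fill_ratio(4.5, 2.0, math.radians(deg)), 3))
```

The ratio is 1 when the object is aligned with the axes and drops as the yaw grows, and it drops faster for elongated objects.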
2 Method
2.1 Detection

- 3D Backbone: VoxelNet [56, 66] or PointPillars [28].
- The output of the 3D backbone is called BEV features.

- An image-based keypoint detector is used to find object centers: Objects as Points, 19
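As a rough illustration of how a keypoint detector turns a heatmap into object centers, the sketch below keeps cells that are local maxima in their 3x3 neighborhood. It assumes a single-class heatmap stored as nested lists and a hypothetical score threshold of 0.3; real implementations typically do the same thing on tensors with a max-pooling operation.

```python
def find_center_peaks(heatmap, threshold=0.3):
    """Return (row, col, score) for cells that are 3x3 local maxima above threshold.

    Ties are kept, so a flat plateau yields several peaks.
    """
    h, w = len(heatmap), len(heatmap[0])
    peaks = []
    for r in range(h):
        for c in range(w):
            score = heatmap[r][c]
            if score < threshold:
                continue
            # Collect the up-to-8 neighbors, clipped at the grid border.
            neighbors = [
                heatmap[rr][cc]
                for rr in range(max(0, r - 1), min(h, r + 2))
                for cc in range(max(0, c - 1), min(w, c + 2))
                if (rr, cc) != (r, c)
            ]
            if all(score >= n for n in neighbors):
                peaks.append((r, c, score))
    return peaks


heatmap = [
    [0.0, 0.1, 0.0, 0.0],
    [0.1, 0.9, 0.2, 0.0],
    [0.0, 0.2, 0.1, 0.6],
    [0.0, 0.0, 0.0, 0.1],
]
peaks = find_center_peaks(heatmap)
```

Each returned peak is a candidate object center; the remaining box attributes would be read from other output maps at the same cell.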