Untitled

generates 3D object proposals from bird’s eye view map and project them to three views
M: Multimodal Fusion
hand-crafted features

Untitled

We discretize the projected point cloud into a 2D grid with resolution of 0.1m.
1. hand-crafted: For each cell, the height feature is computed as the maximum height of the points in the cell.
2. To encode more detailed height information, the point cloud is devided equally into M slices. A height map is computed for each slice, thus we obtain M height maps.
hand-crafted: The intensity feature is the reflectance value of the point which has the maximum height in each cell.
density: indicates the number of points in each cell
⇒ the bird’s eye view map is encoded as (M +2)-channel features.

References

Multi-view 3d object detection network for autonomous driving, CVPR 2017