
- generates 3D object proposals from the bird’s eye view map and projects them to three views
- M: Multimodal Fusion
- hand-crafted features

- We discretize the projected point cloud into a 2D grid with a resolution of 0.1 m.
- hand-crafted: For each cell, the height feature is computed as the maximum height of the points in the cell.
- To encode more detailed height information, the point cloud is divided equally into M slices. A height map is computed for each slice, so we obtain M height maps.
- hand-crafted: The intensity feature is the reflectance value of the point which has the maximum height in each cell.
- density: indicates the number of points in each cell
- ⇒ the bird’s eye view map is encoded as (M + 2)-channel features.
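
The encoding steps above can be sketched in NumPy as follows. This is a minimal illustration, not the paper's implementation: the function name `encode_bev`, the default ranges, the choice of M = 4 slices, measuring heights relative to the lower z bound, and storing the raw (unnormalized) point count as density are all assumptions made for the example.

```python
import numpy as np


def encode_bev(points, x_range=(0.0, 70.4), y_range=(-40.0, 40.0),
               z_range=(-2.0, 1.25), resolution=0.1, num_slices=4):
    """Encode an (N, 4) array of [x, y, z, reflectance] points as an
    (H, W, num_slices + 2) bird's eye view map.

    All default values are illustrative assumptions, not taken from the notes.
    """
    (x0, x1), (y0, y1), (z0, z1) = x_range, y_range, z_range
    H = int(round((x1 - x0) / resolution))
    W = int(round((y1 - y0) / resolution))
    bev = np.zeros((H, W, num_slices + 2), dtype=np.float32)

    # Keep only the points that fall inside the region of interest.
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    mask = (x >= x0) & (x < x1) & (y >= y0) & (y < y1) & (z >= z0) & (z < z1)
    pts = points[mask]
    if pts.shape[0] == 0:
        return bev

    # Discretize x/y into grid cells and z into M equal slices.
    rows = np.minimum(((pts[:, 0] - x0) / resolution).astype(int), H - 1)
    cols = np.minimum(((pts[:, 1] - y0) / resolution).astype(int), W - 1)
    slice_h = (z1 - z0) / num_slices
    slices = np.minimum(((pts[:, 2] - z0) / slice_h).astype(int), num_slices - 1)
    heights = pts[:, 2] - z0  # assumption: height measured from the lower z bound

    # Height maps: maximum height per cell, computed separately for each slice.
    height_maps = np.zeros((H, W, num_slices), dtype=np.float32)
    np.maximum.at(height_maps, (rows, cols, slices), heights)

    # Intensity: reflectance of the highest point in each cell.
    flat = rows * W + cols
    order = np.lexsort((pts[:, 2], flat))  # sort by cell, then by height
    flat_sorted = flat[order]
    is_last = np.r_[flat_sorted[1:] != flat_sorted[:-1], True]
    top = order[is_last]  # index of the highest point of every occupied cell
    intensity = np.zeros(H * W, dtype=np.float32)
    intensity[flat[top]] = pts[top, 3]

    # Density: number of points in each cell (raw count in this sketch).
    density = np.bincount(flat, minlength=H * W).astype(np.float32)

    bev[:, :, :num_slices] = height_maps
    bev[:, :, num_slices] = intensity.reshape(H, W)
    bev[:, :, num_slices + 1] = density.reshape(H, W)
    return bev
```

Calling `encode_bev(points)` on a LiDAR scan stored as an `(N, 4)` array returns a map whose first M channels are the per-slice height maps, followed by one intensity channel and one density channel, matching the (M + 2)-channel layout described above.
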
References
Multi-View 3D Object Detection Network for Autonomous Driving, CVPR 2017