Features from 2 views better than 3d?
精度太低。A6000,1*48G训练了30天?
https://github.com/haibo-qiu/GFNet
#6 of SementicKITTI, but very simple
差7个点
https://paperswithcode.com/sota/3d-semantic-segmentation-on-semantickitti


Figure 2: The distant occluded points caused by RV projection are misclassified as the labels of near displayed points in the red rectangle areas, while they are totally captured by BEV. By propagating the information between BEV and RV, this issue can be well addressed by our GFNet.
Show Figure 5.

融合两种图像特征到3D点云,然后3D卷积⇒3D Boxes
特征融合,可以加结果ensemble吗Multimodal Object Detection via Probabilistic Ensembling, eccv 22 ?
backbone is resnet,试试我们的lightweight
More modules?
用了5个监督:
2 2d boxes losses, 2 3d boxes losses, + 1 final 3d box loss
geometric flow module (GFM)利用了3D和RV,BEV之间的逐点的对应关系,+简单融合,如何提高feature融合?
