Abstract

Better LiDAR-to-camera projected depth + image features ⇒ better depth estimation.
Self-supervised prediction of a uv offset for each BEV pixel.
There is actually no graph network.

inaccurate calibration between LiDAR and camera

LocalAlign: addresses local misalignment due to sensor calibration errors.
GlobalAlign: offset noise is added only to the GlobalAlign module during training, to simulate global misalignment.

LocalAlign
1. The name "Dual Transform" is misleading:
  1. LiDAR-to-camera projection provides the projected depth;
  2. There is actually no graph, only each pixel's depth and the k depths of its kNN neighbors.
Dual-Depth + image feature ⇒ better depth for BEV
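
A rough illustration of the "no graph, only kNN depths" point: for each projected LiDAR pixel, collect its own depth plus the depths of its k nearest projected neighbors. The helper name `knn_neighbor_depths`, the `(u, v, depth)` point format, and the brute-force search are assumptions made for this sketch, not the paper's implementation.

```python
import math

def knn_neighbor_depths(points, k):
    """For each projected point (u, v, depth), return (depth, [depths of k nearest points]).

    Hypothetical helper: brute-force O(n^2) search in pixel space.
    """
    result = []
    for i, (u, v, d) in enumerate(points):
        # Image-plane distance from this point to every other projected point.
        others = [
            (math.hypot(u - u2, v - v2), d2)
            for j, (u2, v2, d2) in enumerate(points)
            if j != i
        ]
        others.sort(key=lambda t: t[0])
        result.append((d, [d2 for _, d2 in others[:k]]))
    return result

# Toy projected points: three close together, one far away.
points = [(10, 10, 5.0), (11, 10, 5.2), (10, 12, 4.9), (40, 40, 20.0)]
features = knn_neighbor_depths(points, k=2)
```

Each entry of `features` is the per-pixel "dual depth" input in this toy form: one projected depth and k neighbor depths, with no edges or message passing involved.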

GlobalAlign

input: MM BEV + noise
1. The noise is added to the image BEV feature, which is then used to predict a uv offset per pixel.
output: DeformBEV
GT: SupBEV, obtained by passing the MM BEV through a convolution
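
A minimal sketch of the deform-and-supervise idea, under simplifying assumptions: the BEV is a single-channel 2D grid, the per-pixel uv offsets are set by hand rather than predicted by a network, sampling is nearest-neighbor with border clamping, and the loss is plain L1. The names `deform_bev` and `l1_loss` are hypothetical and only serve this example.

```python
def deform_bev(bev, offsets):
    """Resample a 2D BEV grid with per-pixel (du, dv) offsets.

    Nearest-neighbor sampling, clamped at the grid border.
    """
    h, w = len(bev), len(bev[0])
    out = [[0.0] * w for _ in range(h)]
    for v in range(h):
        for u in range(w):
            du, dv = offsets[v][u]
            # Sample location = own position + offset, clamped to the grid.
            su = min(max(int(round(u + du)), 0), w - 1)
            sv = min(max(int(round(v + dv)), 0), h - 1)
            out[v][u] = bev[sv][su]
    return out

def l1_loss(a, b):
    """Mean absolute difference between two same-sized grids."""
    h, w = len(a), len(a[0])
    return sum(abs(a[v][u] - b[v][u]) for v in range(h) for u in range(w)) / (h * w)

# Toy target (stand-in for SupBEV) and a copy shifted one cell to the right
# (stand-in for the noise-perturbed BEV).
sup_bev = [[1.0, 2.0, 3.0],
           [4.0, 5.0, 6.0]]
noisy_bev = [[1.0, 1.0, 2.0],
             [4.0, 4.0, 5.0]]

# Offsets that undo the shift: every pixel samples one cell to its right.
offsets = [[(1, 0)] * 3 for _ in range(2)]
deform = deform_bev(noisy_bev, offsets)

loss_before = l1_loss(noisy_bev, sup_bev)
loss_after = l1_loss(deform, sup_bev)
```

In this toy case the offsets reduce the L1 error against the target everywhere except the clamped right border, which is the signal a self-supervised offset predictor would be trained on.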