Abstract

  1. better LiDAR-to-camera projected depth + image feature ⇒ better depth estimation
  2. Self-supervised prediction of a (u, v) offset per BEV pixel.
  3. There is actually no graph network.

inaccurate calibration between LiDAR and camera

  1. LocalAlign: addresses local misalignment due to sensor calibration errors.
  2. GlobalAlign: we only add offset noise to the GlobalAlign module during training to simulate global misalignment issues.

  1. LocalAlign
    1. Dual Transform (a misleading name)
      1. LiDAR-to-camera provides projected depth;
      2. There is actually no graph: only each pixel's depth plus the k depths of its kNN neighbors.
  2. Dual-Depth + image feature ⇒ better depth for BEV
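
The LocalAlign notes above can be sketched in a few lines: project points to pixels to get a sparse depth, then collect each pixel's own depth together with the depths of its k nearest neighbors. This is only an illustration under assumptions that are not in the notes: the points are already in the camera frame, the intrinsics are a plain pinhole `(fx, fy, cx, cy)`, neighbors are searched in pixel space by brute force, and the names `project_to_image` and `knn_depths` are hypothetical.

```python
import math


def project_to_image(points, intrinsics):
    """Project camera-frame points (x, y, z) to (u, v, depth) tuples.

    Assumption: a pinhole model with intrinsics (fx, fy, cx, cy);
    points behind the camera (z <= 0) are dropped.
    """
    fx, fy, cx, cy = intrinsics
    projected = []
    for x, y, z in points:
        if z <= 0:
            continue
        projected.append((fx * x / z + cx, fy * y / z + cy, z))
    return projected


def knn_depths(projected, k):
    """Return (own_depth, [k neighbor depths]) for every projected point.

    Assumption: neighbors are the k closest points in pixel space,
    found by brute force (fine for a sketch, too slow for real clouds).
    """
    result = []
    for i, (u, v, d) in enumerate(projected):
        others = [
            (math.hypot(u - u2, v - v2), d2)
            for j, (u2, v2, d2) in enumerate(projected)
            if j != i
        ]
        others.sort(key=lambda pair: pair[0])
        result.append((d, [d2 for _, d2 in others[:k]]))
    return result


points = [(0.0, 0.0, 10.0), (1.0, 0.0, 10.0), (0.0, 1.0, 20.0),
          (5.0, 5.0, 50.0), (0.0, 0.0, -1.0)]
pixels = project_to_image(points, (100.0, 100.0, 50.0, 50.0))
dual_depth = knn_depths(pixels, k=2)
```

Each entry of `dual_depth` pairs one projected depth with its neighbor depths; no graph structure is built. In the notes this pair is combined with the image feature to estimate depth for the BEV, and that step is not shown here.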

GlobalAlign

  1. input: MM BEV + noise
    1. The noise is added to the image BEV feature, which is then used to predict a per-pixel (u, v) offset.
  2. output: DeformBEV
  3. GT: SupBEV, obtained by applying a convolution to the MM BEV
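
A toy version of the GlobalAlign data flow listed above, on a single-channel BEV grid. Everything here is an assumption made for illustration: the offsets are passed in by hand (in the notes they are predicted), the warp uses bilinear sampling with border clamping, the noise is modeled as one global shift, the loss is L1, and the names `bilinear`, `deform_bev`, `shift_bev`, and `l1_loss` are hypothetical.

```python
def bilinear(grid, x, y):
    """Sample a 2D grid at fractional (x, y), clamping to the border."""
    h, w = len(grid), len(grid[0])
    x = min(max(x, 0.0), w - 1.0)
    y = min(max(y, 0.0), h - 1.0)
    x0, y0 = int(x), int(y)
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    ax, ay = x - x0, y - y0
    top = grid[y0][x0] * (1 - ax) + grid[y0][x1] * ax
    bottom = grid[y1][x0] * (1 - ax) + grid[y1][x1] * ax
    return top * (1 - ay) + bottom * ay


def deform_bev(bev, offsets):
    """DeformBEV sketch: out[y][x] = bev sampled at (x + du, y + dv)."""
    return [
        [bilinear(bev, x + offsets[y][x][0], y + offsets[y][x][1])
         for x in range(len(bev[0]))]
        for y in range(len(bev))
    ]


def shift_bev(bev, du, dv):
    """Training-only noise sketch: shift the whole map by one global offset."""
    constant = [[(du, dv)] * len(bev[0]) for _ in bev]
    return deform_bev(bev, constant)


def l1_loss(pred, target):
    """Mean absolute difference between two equally sized grids."""
    count = len(pred) * len(pred[0])
    total = sum(abs(p - t) for rp, rt in zip(pred, target)
                for p, t in zip(rp, rt))
    return total / count


sup_bev = [[0.0, 1.0, 2.0], [3.0, 4.0, 5.0], [6.0, 7.0, 8.0]]
noisy = shift_bev(sup_bev, 1.0, 0.0)            # simulated misalignment
undo = [[(-1.0, 0.0)] * 3 for _ in range(3)]    # stand-in for predicted offsets
loss_before = l1_loss(noisy, sup_bev)
loss_after = l1_loss(deform_bev(noisy, undo), sup_bev)
```

With the hand-written corrective offsets the loss drops but does not reach zero, because the border column was lost to clamping when the noise was applied.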