Abstract
- better LiDAR-to-camera projected depth + image feature ⇒ better depth estimation
- Self-supervised prediction of a uv offset per BEV pixel
- actually no graph network.

inaccurate calibration between LiDAR and camera

- LocalAlign: addresses local misalignment due to sensor calibration errors.
- GlobalAlign: offset noise is added to the GlobalAlign module during training only, to simulate global misalignment.

LocalAlign
- The name "Dual Transform" is misleading
- LiDAR-to-camera projection provides the projected depth
- There is actually no graph: just each pixel's depth plus the k depths of its kNN neighbors
- Dual-Depth + image feature ⇒ better depth for BEV
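
A minimal sketch of the "own depth plus k neighbor depths" idea noted above, written with NumPy. The function name `neighbor_depths`, the brute-force search, and the choice to find neighbors in pixel (uv) space are assumptions made for illustration; they are not taken from the paper's code.

```python
import numpy as np

def neighbor_depths(uv, depth, k=3):
    """For each projected LiDAR point, return its own depth followed by the
    depths of its k nearest neighbors in pixel space (brute-force kNN).

    uv:    (N, 2) pixel coordinates of the projected points (assumed input)
    depth: (N,)   projected depth of each point
    Requires N > k. Returns an array of shape (N, k + 1).
    """
    uv = np.asarray(uv, dtype=np.float64)
    depth = np.asarray(depth, dtype=np.float64)
    # Pairwise squared pixel distances, shape (N, N).
    diff = uv[:, None, :] - uv[None, :, :]
    dist2 = (diff ** 2).sum(axis=-1)
    # Exclude each point from its own neighbor list.
    np.fill_diagonal(dist2, np.inf)
    # Indices of the k closest other points, nearest first.
    idx = np.argsort(dist2, axis=1, kind="stable")[:, :k]
    # Column 0 is the point's own depth, columns 1..k are neighbor depths.
    return np.concatenate([depth[:, None], depth[idx]], axis=1)

# Four points forming two tight pairs along one image row.
uv = [[0, 0], [1, 0], [5, 0], [6, 0]]
depth = [10.0, 11.0, 20.0, 21.0]
dual = neighbor_depths(uv, depth, k=1)
print(dual.shape)  # (4, 2)
```

The resulting per-point depth vector is what would be combined with the image feature downstream. A KD-tree would replace the O(N²) distance matrix for realistic point counts.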

GlobalAlign
- input: MM BEV + noise
- Noise is added to the image BEV feature, and the module predicts a uv offset per pixel.
- output: DeformBEV
- GT: SupBEV, obtained by applying a convolution to the MM BEV
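
The GlobalAlign notes above can be pictured as: resample the BEV feature at positions shifted by a per-pixel uv offset to get DeformBEV, then compare it against SupBEV. The sketch below only illustrates that warping step in NumPy. The names `warp_bev` and `align_loss`, the bilinear sampling with border clamping, and the mean-squared-error loss are all assumptions for illustration; the offset here is supplied by hand rather than predicted by a network.

```python
import numpy as np

def warp_bev(feat, offset):
    """Resample a BEV feature map feat (C, H, W) at positions shifted by
    offset (2, H, W): offset[0] is the u (column) shift, offset[1] the
    v (row) shift. Bilinear interpolation, clamped at the borders."""
    C, H, W = feat.shape
    v, u = np.meshgrid(np.arange(H), np.arange(W), indexing="ij")
    # Sampling locations, clamped to the valid grid.
    us = np.clip(u + offset[0], 0, W - 1)
    vs = np.clip(v + offset[1], 0, H - 1)
    u0 = np.floor(us).astype(int)
    v0 = np.floor(vs).astype(int)
    u1 = np.minimum(u0 + 1, W - 1)
    v1 = np.minimum(v0 + 1, H - 1)
    wu = us - u0  # fractional part along u
    wv = vs - v0  # fractional part along v
    top = feat[:, v0, u0] * (1 - wu) + feat[:, v0, u1] * wu
    bot = feat[:, v1, u0] * (1 - wu) + feat[:, v1, u1] * wu
    return top * (1 - wv) + bot * wv

def align_loss(deform_bev, sup_bev):
    """Mean squared error between DeformBEV and SupBEV (assumed loss)."""
    return float(np.mean((deform_bev - sup_bev) ** 2))

# Toy check: a zero offset leaves the feature map unchanged.
feat = np.arange(12, dtype=np.float64).reshape(1, 3, 4)
zero = np.zeros((2, 3, 4))
print(align_loss(warp_bev(feat, zero), feat))  # 0.0
```

In training, the offset would come from the module's prediction on the noised BEV feature, and the loss against SupBEV is what makes the scheme self-supervised.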