DUSty (2021) / DUSty v2 (2023)

Generative Range Imaging for Learning Scene Priors of 3D LiDAR Data

Kazuto Nakashima1     Yumi Iwashita2     Ryo Kurazume1
1Kyushu University     2Jet Propulsion Laboratory, Caltech
WACV 2023
[Teaser: training data from KITTI vs. sampling from our learned priors]

Abstract

TL;DR: We propose GAN-based resolution-free data priors for LiDAR domain adaptation

3D LiDAR sensors are indispensable for the robust vision of autonomous mobile robots. However, deploying LiDAR-based perception algorithms often fails due to a domain gap from the training environment, such as inconsistent angular resolution and missing properties. Existing studies have tackled the issue by learning inter-domain mapping, but the transferability is constrained by the training configuration, and the training is susceptible to a peculiar lossy noise called ray-drop. To address the issue, this paper proposes a generative model of LiDAR range images applicable to data-level domain transfer. Motivated by the fact that LiDAR measurement is based on point-by-point range imaging, we train an implicit image representation-based generative adversarial network along with a differentiable ray-drop effect. We demonstrate the fidelity and diversity of our model in comparison with point-based and image-based state-of-the-art generative models. We also showcase upsampling and restoration applications. Furthermore, we introduce a Sim2Real application for LiDAR semantic segmentation. We demonstrate that our method is effective as a realistic ray-drop simulator and outperforms state-of-the-art methods.
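For intuition, here is a minimal sketch of the differentiable ray-drop effect, assuming a per-pixel Bernoulli drop mask relaxed with Gumbel-Sigmoid noise and a straight-through estimator (PyTorch; all names and shapes are illustrative, not our exact implementation):

import torch

def gumbel_sigmoid(logits: torch.Tensor, tau: float = 1.0) -> torch.Tensor:
    """Relaxed Bernoulli sample with a straight-through hard threshold."""
    u = torch.rand_like(logits).clamp(1e-6, 1.0 - 1e-6)
    noise = torch.log(u) - torch.log1p(-u)        # Logistic(0, 1) noise
    soft = torch.sigmoid((logits + noise) / tau)  # relaxed sample in (0, 1)
    hard = (soft > 0.5).float()                   # hard 0/1 keep mask
    return hard + soft - soft.detach()            # gradients flow via soft path

# Toy usage: apply ray-drop to a generated range image (B x 1 x H x W).
range_image = torch.rand(1, 1, 64, 512, requires_grad=True)
keep_logits = torch.randn(1, 1, 64, 512, requires_grad=True)
mask = gumbel_sigmoid(keep_logits, tau=0.5)
measured = range_image * mask  # dropped rays read as 0 (no return)
measured.sum().backward()      # both branches receive gradients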

Reconstruction

Reconstruction via optimization-based GAN inversion [Roich et al. TOG'22].
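A minimal sketch of this two-stage inversion in the spirit of pivotal tuning [Roich et al. TOG'22], assuming a generator G that maps a latent code to a range image (steps and learning rates are illustrative):

import torch
import torch.nn.functional as F

def invert(G: torch.nn.Module, target: torch.Tensor,
           latent_dim: int = 512, steps: int = 500) -> torch.Tensor:
    """Stage 1: optimize a latent code so that G(z) matches the target."""
    z = torch.zeros(1, latent_dim, requires_grad=True)
    opt = torch.optim.Adam([z], lr=1e-2)
    for _ in range(steps):
        loss = F.l1_loss(G(z), target)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return z.detach()

def pivotal_tuning(G, target, pivot, steps: int = 500):
    """Stage 2: fine-tune the generator weights around the fixed pivot."""
    opt = torch.optim.Adam(G.parameters(), lr=1e-4)
    for _ in range(steps):
        loss = F.l1_loss(G(pivot), target)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return G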



Restoration

Corrupted data can also be restored by exploring the learned scene priors (a sketch follows the figure).

[Figure: restoration results. Panels: original, sparse "rings", sparse points]
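A minimal sketch of restoration by inversion, assuming a binary mask marking the surviving measurements (names are illustrative): the loss is evaluated only on observed pixels, and the learned prior fills in the rest.

import torch
import torch.nn.functional as F

def restore(G, corrupted, observed, latent_dim: int = 512, steps: int = 500):
    """Invert G against only the observed pixels, then decode in full."""
    z = torch.zeros(1, latent_dim, requires_grad=True)
    opt = torch.optim.Adam([z], lr=1e-2)
    for _ in range(steps):
        pred = G(z)
        # penalize discrepancy only where measurements exist
        loss = F.l1_loss(pred * observed, corrupted * observed)
        opt.zero_grad()
        loss.backward()
        opt.step()
    with torch.no_grad():
        return G(z)  # fully populated range image from the scene prior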

Upsampling

The 1x result was obtained by reconstruction; the 2x and 4x results follow simply by densifying the coordinate queries (a sketch follows the figure).

[Figure: upsampling results. Panels: target from KITTI, 1x, 2x, 4x]
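A minimal sketch of the coordinate-query upsampling, assuming the implicit generator takes a latent code plus an angular coordinate grid (the signature G(z, coords) and the [-1, 1] convention are assumptions):

import torch

def angular_grid(h: int, w: int) -> torch.Tensor:
    """(1, H, W, 2) grid of (elevation, azimuth) queries in [-1, 1]."""
    ev = torch.linspace(-1.0, 1.0, h)
    az = torch.linspace(-1.0, 1.0, w)
    return torch.stack(torch.meshgrid(ev, az, indexing="ij"), dim=-1).unsqueeze(0)

z = torch.randn(1, 512)                      # latent code recovered by inversion
# range_1x = G(z, angular_grid(64, 512))     # training resolution
# range_2x = G(z, angular_grid(128, 1024))   # same z, denser queries
# range_4x = G(z, angular_grid(256, 2048))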

Sim2Real Semantic Segmentation

Our model can be used as a realistic ray-drop noise simulator!
We evaluated semantic segmentation on KITTI annotated with the car and pedestrian classes [Wu et al. ICRA'19].
The baseline was trained on GTA-LiDAR only (in-game simulation without noise), while ours was trained on GTA-LiDAR with our simulated ray-drop noise (a sketch follows the figure).

[Figure: segmentation results. Columns: input, ground truth (GT), baseline, ours]
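A minimal sketch of the simulator side, assuming per-pixel drop probabilities predicted by our trained model (helper names are hypothetical): a Bernoulli mask is sampled and burned into the clean simulated range image before segmentation training.

import torch

def apply_ray_drop(sim_range: torch.Tensor, drop_prob: torch.Tensor) -> torch.Tensor:
    """Sample a per-pixel keep mask and zero out the dropped rays."""
    keep = torch.bernoulli(1.0 - drop_prob)  # 1 = ray returns, 0 = dropped
    return sim_range * keep                  # dropped rays read as 0

# Toy usage on a stand-in GTA-LiDAR frame (noise-free simulation):
sim_range = torch.rand(1, 1, 64, 512)
drop_prob = torch.full_like(sim_range, 0.2)  # would come from our model
noisy = apply_ray_drop(sim_range, drop_prob)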

Citation

@inproceedings{nakashima2023generative,
    author    = {Nakashima, Kazuto and Iwashita, Yumi and Kurazume, Ryo},
    title     = {Generative Range Imaging for Learning Scene Priors of 3D LiDAR Data},
    booktitle = {IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)},
    pages     = {},
    year      = {2023}
}

Related Work

Learning to Drop Points for LiDAR Scan Synthesis
Kazuto Nakashima and Ryo Kurazume
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2021
[Project] [PDF]

Acknowledgments

This work was partially supported by a Grant-in-Aid for JSPS Fellows Grant Number JP19J12159, JSPS KAKENHI Grant Number JP20H00230, and JST Moonshot R&D Grant Number JPMJMS2032.