TL;DR: R2DM, a denoising diffusion model for LiDAR data generation and completion
Abstract
Generative modeling of 3D LiDAR data is an emerging task with promising applications for autonomous mobile robots, such as scalable
simulation, scene manipulation, and sparse-to-dense completion of LiDAR point clouds. While existing approaches have demonstrated the
feasibility of image-based LiDAR data generation using deep generative models, they still struggle with fidelity and training stability. In
this work, we present R2DM, a novel generative model for LiDAR data that can generate diverse and high-fidelity 3D scene point clouds
based on the image representation of range and reflectance intensity. Our method is built upon denoising diffusion probabilistic models
(DDPMs), which have shown impressive results among generative model frameworks in recent years. To effectively train DDPMs in the LiDAR
domain, we first conduct an in-depth analysis of data representation, loss functions, and spatial inductive biases. Leveraging our R2DM model,
we also introduce a flexible LiDAR completion pipeline based on the powerful capabilities of DDPMs. We demonstrate that our method surpasses
existing methods in the generation task on the KITTI-360 and KITTI-Raw datasets, as well as in the completion task on the KITTI-360 dataset.
Approach
1. We build a denoising diffusion probabilistic model (DDPM) [Ho et al. 2020] on the equirectangular image representation with two channels: range and reflectance intensity (a projection sketch follows this list). Each diffusion step is indexed by continuous time [Kingma et al. 2021], so the number of sampling steps can be adjusted to trade off quality against computational cost.
2. For the reverse diffusion, an Efficient U-Net [Saharia et al. 2022] is trained to recursively denoise the latent variables z, conditioned on the beam angle-based spatial bias and the scheduled signal-to-noise ratio (SNR); a training-step sketch also follows this list.
3. The trained R2DM can be used for sparse-to-dense LiDAR completion without task-specific re-training. We adopt the DDPM-based image inpainting technique RePaint [Lugmayr et al. 2022]; a completion sketch appears in the Results section below.
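As a concrete illustration of step 1, the sketch below projects a raw point cloud onto the two-channel equirectangular image. The 64×1024 resolution and the [+3°, -25°] vertical field of view are assumptions modeled on a Velodyne HDL-64E-like sensor, not values stated on this page.

```python
import numpy as np

def project_to_range_image(points, intensity, H=64, W=1024,
                           fov_up_deg=3.0, fov_down_deg=-25.0):
    """Project an (N, 3) point cloud with (N,) reflectance intensities
    onto a two-channel equirectangular image (range, intensity).
    H, W, and the vertical FOV are illustrative HDL-64E-like values."""
    fov_up = np.radians(fov_up_deg)
    fov_down = np.radians(fov_down_deg)
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    r = np.linalg.norm(points, axis=1)                  # range per point
    yaw = np.arctan2(y, x)                              # azimuth in [-pi, pi]
    pitch = np.arcsin(np.clip(z / np.maximum(r, 1e-8), -1.0, 1.0))
    u = ((1.0 - yaw / np.pi) / 2.0 * W).astype(np.int32) % W
    v = ((fov_up - pitch) / (fov_up - fov_down) * H).astype(np.int32)
    valid = (v >= 0) & (v < H) & (r > 0)
    # Sort by descending range so nearer returns overwrite farther ones.
    order = np.argsort(-r[valid])
    uu, vv = u[valid][order], v[valid][order]
    img = np.zeros((2, H, W), dtype=np.float32)
    img[0, vv, uu] = r[valid][order]                    # channel 0: range
    img[1, vv, uu] = intensity[valid][order]            # channel 1: reflectance
    return img
```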
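For step 2, a minimal continuous-time training step could look like the following. The model signature, the cosine noise schedule, and the epsilon-prediction loss are illustrative assumptions; the paper's exact parameterization may differ.

```python
import math
import torch
import torch.nn.functional as F

def training_step(model, x0, spatial_bias):
    """One continuous-time DDPM training step (illustrative sketch).
    x0: (B, 2, H, W) range/reflectance images, assumed scaled to [-1, 1].
    spatial_bias: beam angle coordinate channels, concatenated as input."""
    b = x0.shape[0]
    # Continuous time t ~ U(0, 1), clamped to avoid log(0) at the endpoints.
    t = torch.rand(b, device=x0.device).clamp(1e-5, 1.0 - 1e-5)
    alpha = torch.cos(0.5 * math.pi * t)            # cosine schedule (assumed)
    sigma = torch.sin(0.5 * math.pi * t)
    log_snr = 2.0 * (alpha.log() - sigma.log())     # scheduled SNR conditioning
    a, s = alpha.view(b, 1, 1, 1), sigma.view(b, 1, 1, 1)
    noise = torch.randn_like(x0)
    z_t = a * x0 + s * noise                        # forward diffusion q(z_t | x0)
    # Hypothetical model signature: log-SNR plays the role of the time embedding.
    eps_pred = model(torch.cat([z_t, spatial_bias], dim=1), log_snr)
    return F.mse_loss(eps_pred, noise)              # epsilon-prediction loss
```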
Results
Unconditional generation
We trained R2DM on the KITTI-360 dataset and performed 256-step DDPM sampling.
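Since the model is trained in continuous time, the step count is a free inference-time knob; 256 steps is the setting used here. A sampler under the same assumed cosine schedule as the training sketch above might look like this (a deterministic DDIM-style update is used for brevity, whereas the results use ancestral DDPM sampling):

```python
import math
import torch

@torch.no_grad()
def sample(model, spatial_bias, shape, steps=256):
    """Sampling with an adjustable step count (sketch)."""
    z = torch.randn(shape)                               # start from pure noise
    ts = torch.linspace(0.999, 0.0, steps + 1).tolist()  # avoid alpha = 0 at t = 1
    for t, s in zip(ts[:-1], ts[1:]):
        a_t, g_t = math.cos(0.5 * math.pi * t), math.sin(0.5 * math.pi * t)
        a_s, g_s = math.cos(0.5 * math.pi * s), math.sin(0.5 * math.pi * s)
        log_snr = torch.full((shape[0],), 2.0 * (math.log(a_t) - math.log(g_t)))
        eps = model(torch.cat([z, spatial_bias], dim=1), log_snr)
        x0 = (z - g_t * eps) / a_t                       # predicted clean image
        z = a_s * x0 + g_s * eps                         # deterministic step t -> s
    return z
```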
Sparse-to-dense completion
We used the pre-trained R2DM and performed RePaint-based completion.
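For reference, the core of the RePaint loop adapted here can be sketched as below. All three callables are hypothetical wrappers around the trained R2DM, and the resampling count R follows the spirit of Lugmayr et al. rather than exact settings from this page.

```python
import torch

@torch.no_grad()
def repaint_completion(reverse_step, forward_to, forward_one,
                       x_known, mask, T=256, R=4):
    """RePaint-style sparse-to-dense completion (simplified sketch).
    x_known: sparse two-channel image; mask: 1 where LiDAR returns exist.
    Hypothetical wrappers around the trained R2DM:
      reverse_step(z, t) -> one denoising step, z_t to z_{t-1}
      forward_to(x, t)   -> x forward-diffused from t = 0 to level t
      forward_one(z, t)  -> one forward step, z_{t-1} back to z_t"""
    z = torch.randn_like(x_known)                 # start from pure noise
    for t in range(T, 0, -1):
        for r in range(R):                        # RePaint resampling loop
            known = forward_to(x_known, t - 1)    # re-noise the observed pixels
            unknown = reverse_step(z, t)          # denoise the missing pixels
            z = mask * known + (1 - mask) * unknown
            if r < R - 1:
                z = forward_one(z, t)             # diffuse back to harmonize
    return z
```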
Citation
@inproceedings{nakashima2024lidar,
  title     = {LiDAR Data Synthesis with Denoising Diffusion Probabilistic Models},
  author    = {Kazuto Nakashima and Ryo Kurazume},
  booktitle = {Proceedings of the IEEE International Conference on Robotics and Automation (ICRA)},
  pages     = {14724--14731},
  year      = 2024
}
Acknowledgments
This work was supported by JSPS KAKENHI Grant Number JP23K16974 and JST Moonshot R&D Grant Number JPMJMS2032.
References
Denoising Diffusion Probabilistic Models
Jonathan Ho, Ajay Jain, Pieter Abbeel
NeurIPS 2020
https://arxiv.org/abs/2006.11239
Variational Diffusion Models
Diederik P. Kingma, Tim Salimans, Ben Poole, Jonathan Ho
NeurIPS 2021
https://arxiv.org/abs/2107.00630
Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding
Chitwan Saharia, William Chan, Saurabh Saxena, Lala Li, Jay Whang, Emily Denton, Seyed Kamyar Seyed Ghasemipour, Burcu Karagol Ayan, S. Sara Mahdavi, Raphael Gontijo Lopes, Tim Salimans, Jonathan Ho, David J. Fleet, Mohammad Norouzi
NeurIPS 2022
https://arxiv.org/abs/2205.11487
RePaint: Inpainting using Denoising Diffusion Probabilistic Models
Andreas Lugmayr, Martin Danelljan, Andres Romero, Fisher Yu, Radu Timofte, Luc Van Gool
CVPR 2022
https://arxiv.org/abs/2201.09865
KITTI-360: A Novel Dataset and Benchmarks for Urban Scene Understanding in 2D and 3D
Yiyi Liao, Jun Xie, Andreas Geiger
TPAMI 2022
https://arxiv.org/abs/2109.13410
Development of a Realistic LiDAR Simulator based on Deep Generative Models
Kazuto Nakashima
Grant-in-Aid for Early-Career Scientists, Japan Society for the Promotion of Science (JSPS)