TL;DR: R2Flow, a straight-flow (rectified flow) generative model for fast LiDAR data generation
Abstract
LiDAR generative models hold promise as powerful data priors for restoration, scene manipulation, and scalable simulation in
autonomous mobile robots. In recent years, approaches based on diffusion models have emerged, significantly improving training stability and
generation quality. Despite their success, diffusion models require numerous iterations of neural network evaluation to generate high-quality
samples, making their growing computational cost a potential barrier for robotics applications. To address this challenge, this paper
presents R2Flow, a fast and high-fidelity generative model for LiDAR data. Our method is based on rectified flows, which learn straight
trajectories and can simulate data generation in significantly fewer sampling steps than diffusion models. We also propose an efficient
Transformer-based model architecture for processing the image representation of LiDAR range and reflectance measurements. Our experiments on
unconditional LiDAR data generation using the KITTI-360 dataset demonstrate the effectiveness of our approach in terms of both efficiency and
quality.
Method Overview
What kind of framework is suitable for representing LiDAR data?
Recent advances in LiDAR generative models are driven by diffusion models built on the range image representation. Current LiDAR diffusion
models follow one of two approaches for generating range images, both inspired by the natural image domain: pixel-space iteration and
feature-space iteration. We prioritize the pixel-level precision required for range images and employ the pixel-space iteration approach;
the two approaches are compared below, followed by a sketch of the range image projection itself.
Pixel-space iteration [Ho et al. 2020]
- Architecture: diffusion modeling on high-dimensional pixels
- Pros: finer details via direct iterative modeling
- Cons: the iterative space is high-dimensional; low throughput
- LiDAR applications: LiDARGen [Zyrianov et al. 2022], R2DM [Nakashima et al. 2024], R2Flow (ours)
Feature-space iteration [Rombach et al. 2022]
- Architecture: diffusion modeling on lower-dimensional features compressed by autoencoders (AEs)
- Pros: the iterative space is low-dimensional; high throughput
- Cons: blurriness due to the extra AE decoding
- LiDAR application: LiDM [Ran et al. 2024]
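To make the range image representation concrete, the following is a minimal sketch of the equirectangular (spherical) projection that turns a LiDAR point cloud into a two-channel range/reflectance image. The function name, the 64×1024 resolution, and the HDL-64E-like vertical field of view are illustrative assumptions, not the paper's exact preprocessing.

```python
import numpy as np

def points_to_range_image(points, reflectance, h=64, w=1024,
                          fov_up=3.0, fov_down=-25.0):
    """Project a LiDAR point cloud onto an equirectangular image.

    points: (N, 3) xyz coordinates in the sensor frame
    reflectance: (N,) per-point intensity
    The FOV defaults assume an HDL-64E-like sensor (illustrative).
    """
    fov_up, fov_down = np.radians(fov_up), np.radians(fov_down)
    depth = np.linalg.norm(points, axis=1)
    yaw = np.arctan2(points[:, 1], points[:, 0])       # azimuth angle
    pitch = np.arcsin(points[:, 2] / (depth + 1e-8))   # elevation angle

    # Map angles to pixel coordinates
    u = 0.5 * (1.0 - yaw / np.pi) * w                          # column
    v = (1.0 - (pitch - fov_down) / (fov_up - fov_down)) * h   # row
    u = np.clip(np.floor(u), 0, w - 1).astype(np.int64)
    v = np.clip(np.floor(v), 0, h - 1).astype(np.int64)

    # Two-channel image: range + reflectance (unmeasured pixels stay 0;
    # duplicate hits per pixel are resolved by last-write order here)
    image = np.zeros((2, h, w), dtype=np.float32)
    image[0, v, u] = depth
    image[1, v, u] = reflectance
    return image
```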
How can we reduce the number of sampling steps?
Sampling from diffusion models requires iterative evaluations of a neural network such as a U-Net, so the overall throughput depends heavily
on how far the number of iterations can be reduced. We address this issue by adopting rectified flows
[Liu et al. 2023], which are robust to the number of sampling steps; the two formulations are compared below, followed by a training/sampling sketch.
Diffusion models [Song et al. 2021]
- Formulation: stochastic differential equations (SDEs)
- Trajectory: probabilistic & curved (prone to discretization errors)
- LiDAR applications: LiDARGen [Zyrianov et al. 2022], R2DM [Nakashima et al. 2024], LiDM [Ran et al. 2024]
Rectified flows [Liu et al. 2023]
- Formulation: ordinary differential equations (ODEs)
- Trajectory: deterministic & straight (easy to approximate with few steps)
- LiDAR application: R2Flow (ours)
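As a concrete illustration of why straight trajectories allow few-step sampling, here is a minimal sketch of 1-rectified flow training and Euler sampling in PyTorch. The velocity network signature model(x, t) is a hypothetical placeholder; this shows the general technique of Liu et al. 2023, not our exact training code.

```python
import torch

def rectified_flow_loss(model, x1):
    """1-rectified flow: regress the constant velocity (x1 - x0)
    along the straight path x_t = (1 - t) * x0 + t * x1."""
    x0 = torch.randn_like(x1)                      # noise endpoint
    t = torch.rand(x1.shape[0], device=x1.device)  # uniform time in [0, 1]
    t_ = t.view(-1, 1, 1, 1)                       # broadcast over (C, H, W)
    xt = (1 - t_) * x0 + t_ * x1                   # linear interpolation
    v_target = x1 - x0                             # straight-line velocity
    return torch.mean((model(xt, t) - v_target) ** 2)

@torch.no_grad()
def sample(model, shape, num_steps=2, device="cpu"):
    """Integrate the learned ODE dx/dt = v(x, t) with Euler steps.
    If the flow is straight, even 1-2 steps approximate it well."""
    x = torch.randn(shape, device=device)          # start from pure noise
    dt = 1.0 / num_steps
    for i in range(num_steps):
        t = torch.full((shape[0],), i * dt, device=device)
        x = x + model(x, t) * dt                   # one Euler step
    return x
```

With num_steps=1, sampling collapses to a single network evaluation, which is exactly the regime where curved diffusion trajectories accumulate large discretization errors but straight flows do not.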
What if the pixel-space iteration is completed in just a few steps?
By combining the pixel-space iteration approach with rectified flows, we demonstrate both high geometric accuracy and generation efficiency.
We also propose a Vision Transformer-based architecture for the neural network that learns flows in pixel space: we modify HDiT
(the hourglass diffusion transformer) [Crowson et al. 2024] to represent LiDAR range &
reflectance (intensity) images, as sketched below.
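As a rough illustration of the input side, the sketch below tokenizes the two-channel range/reflectance image for a ViT-style backbone. The class name, patch size, and embedding width are illustrative assumptions; HDiT's hierarchical (hourglass) attention and our exact modifications are described in the paper.

```python
import torch.nn as nn

class LiDARPatchEmbed(nn.Module):
    """Tokenize a 2-channel (range + reflectance) LiDAR image
    for a Transformer backbone (illustrative, not HDiT itself)."""
    def __init__(self, in_ch=2, dim=256, patch=(4, 4)):
        super().__init__()
        # Non-overlapping patches via strided convolution
        self.proj = nn.Conv2d(in_ch, dim, kernel_size=patch, stride=patch)

    def forward(self, x):                    # x: (B, 2, 64, 1024)
        x = self.proj(x)                     # (B, dim, 16, 256)
        return x.flatten(2).transpose(1, 2)  # (B, 4096, dim) tokens
```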
Results
Unconditional generation
The following compares LiDAR generative models trained on the KITTI-360 dataset. For the
diffusion-based methods (LiDM & R2DM) and R2Flow, we show samples generated with many steps and with few steps.
R2Flow maintains consistent quality across different numbers of steps. Please see
our paper for quantitative results on generation quality and computational cost.
Real data (reference)
DUSty v2 (GAN) [Nakashima et al. 2023]: 1 step (fixed)
LiDARGen (diffusion) [Zyrianov et al. 2022]: 1,160 steps (fixed)
LiDM (diffusion) [Ran et al. 2024]: 200 steps / 2 steps
R2DM (diffusion) [Nakashima et al. 2024]: 256 steps / 2 steps
R2Flow (rectified flows) [Ours]: 256 steps / 2 steps
Note: LiDARGen runs on a fixed noise schedule. LiDM only produces the range modality.
Citation
@inproceedings{nakashima2025fast,
  title     = {Fast LiDAR Data Generation with Rectified Flows},
  author    = {Kazuto Nakashima and Xiaowen Liu and Tomoya Miyawaki and Yumi Iwashita and Ryo Kurazume},
  booktitle = {Proceedings of the IEEE International Conference on Robotics and Automation (ICRA)},
  pages     = {},
  year      = 2025
}
Acknowledgments
This work was supported by JSPS KAKENHI Grant Numbers JP23K16974 and JP20H00230.
References
Score-Based Generative Modeling through Stochastic Differential Equations
Yang Song, Jascha Sohl-Dickstein, Diederik P. Kingma, Abhishek Kumar, Stefano Ermon, Ben Poole
ICLR 2021
https://arxiv.org/abs/2011.13456
Flow Straight and Fast: Learning to Generate and Transfer Data with Rectified Flow
Xingchao Liu, Chengyue Gong, Qiang Liu
ICLR 2023
https://arxiv.org/abs/2209.03003
Generative Range Imaging for Learning Scene Priors of 3D LiDAR Data
Kazuto Nakashima, Yumi Iwashita, Ryo Kurazume
WACV 2023
https://arxiv.org/abs/2210.11750
LiDAR Data Synthesis with Denoising Diffusion Probabilistic Models
Kazuto Nakashima, Ryo Kurazume
ICRA 2024
https://arxiv.org/abs/2309.09256
Fast LiDAR Data Generation with Rectified Flows
Kazuto Nakashima, Xiaowen Liu, Tomoya Miyawaki, Yumi Iwashita, Ryo Kurazume
ICRA 2025
https://arxiv.org/abs/2412.02241
Learning to Generate Realistic LiDAR Point Clouds
Vlas Zyrianov, Xiyue Zhu, Shenlong Wang
ECCV 2022
https://arxiv.org/abs/2209.03954
Towards Realistic Scene Generation with LiDAR Diffusion Models
Haoxi Ran, Vitor Guizilini, Yue Wang
CVPR 2024
https://arxiv.org/abs/2404.00815
Scalable High-Resolution Pixel-Space Image Synthesis with Hourglass Diffusion Transformers
Katherine Crowson, Stefan Andreas Baumann, Alex Birch, Tanishq Mathew Abraham, Daniel Z. Kaplan, Enrico Shippole
ICML 2024
https://arxiv.org/abs/2401.11605
Denoising Diffusion Probabilistic Models
Jonathan Ho, Ajay Jain, Pieter Abbeel
NeurIPS 2020
https://arxiv.org/abs/2006.11239
High-Resolution Image Synthesis with Latent Diffusion Models
Robin Rombach, Andreas Blattmann, Dominik Lorenz, Patrick Esser, Björn Ommer
CVPR 2022
https://arxiv.org/abs/2112.10752
KITTI-360: A Novel Dataset and Benchmarks for Urban Scene Understanding in 2D and 3D
Yiyi Liao, Jun Xie, Andreas Geiger
TPAMI 2022
https://arxiv.org/abs/2109.13410
Development of a Realistic LiDAR Simulator based on Deep Generative Models
Kazuto Nakashima
Grant-in-Aid for Early-Career Scientists, The Japan Society for the Promotion of Science (JSPS)
Development of garbage collecting robot for marine microplastics
Ryo Kurazume, Akihiro Kawamura, Qi An, Shoko Miyauchi, Kazuto Nakashima
Grant-in-Aid for Scientific Research (A), The Japan Society for the Promotion of Science (JSPS)