SceneDiffuser: Efficient and Controllable Driving Simulation Initialization and Rollout

Authors

  • Chiyu Max Jiang

  • Yijing Bai

  • Andre Cornman

  • Christopher Davis

  • Xiukun Huang

  • Hong Jeon

  • Sakshum Kulshresth

  • John Lambert

  • Shuangyu Li

  • Xuanyu Zhou

  • Carlos Fuertes

  • Chang Yuan

  • Mingxing Tan

  • Yin Zhou

  • Dragomir Anguelov

    Abstract

    Realistic and interactive scene simulation is a key prerequisite for autonomous
    vehicle (AV) development. In this work, we present SceneDiffuser, a scene-level
    diffusion prior designed for traffic simulation. It offers a unified framework that
    addresses two key stages of simulation: scene initialization, which involves
    generating initial traffic layouts, and scene rollout, which encompasses the
    closed-loop simulation of agent behaviors. While diffusion models have proven
    effective in learning realistic and multimodal agent distributions, several
    challenges remain, including controllability, maintaining realism in closed-loop
    simulations, and ensuring inference efficiency. To address these issues, we
    introduce amortized diffusion for simulation. This novel diffusion denoising
    paradigm amortizes the computational cost of denoising over future simulation
    steps, significantly reducing the cost per rollout step (16x fewer inference steps)
    while also mitigating closed-loop errors. We further enhance controllability through
    the introduction of generalized hard constraints, a simple yet effective
    inference-time constraint mechanism, as well as language-based constrained scene
    generation via few-shot prompting of a large language model (LLM). Our
    investigations into model scaling reveal that increased computational resources
    significantly improve overall simulation realism. We demonstrate the effectiveness
    of our approach on the Waymo Open Sim Agents Challenge, achieving top open-loop
    performance and the best closed-loop performance among diffusion models.
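
    The idea of amortizing denoising over future simulation steps can be illustrated
    with a toy sketch. Everything below is an assumption made for illustration, not the
    method as implemented in the paper: the scalar state, the hypothetical `toy_denoise`
    stand-in for a learned model, the staggered noise-level buffer, the pre-filled
    warm-up buffer, and the optional `project` hook standing in for an inference-time
    hard constraint. A horizon of 16 is chosen only to mirror the 16x figure quoted
    above.

```python
import random


def toy_denoise(x, level, next_level, clean):
    """Hypothetical stand-in for one learned denoising pass on one entry.

    Shrinks the residual noise (x - clean) from `level` to `next_level`.
    """
    return clean + (x - clean) * (next_level / level)


def clean_state(t):
    """Toy ground-truth state at simulation step t (agent moving 0.5 per step)."""
    return 0.5 * t


def full_rollout(num_steps, horizon, rng, project=lambda x: x):
    """Baseline: every simulation step is denoised from pure noise."""
    executed, calls = [], 0
    for t in range(num_steps):
        x = clean_state(t) + rng.gauss(0.0, 1.0)  # starts at noise level `horizon`
        for level in range(horizon, 0, -1):
            x = project(toy_denoise(x, level, level - 1, clean_state(t)))
            calls += 1  # one denoising pass
        executed.append(x)
    return executed, calls


def amortized_rollout(num_steps, horizon, rng, project=lambda x: x):
    """Assumed amortized scheme: one denoising pass per simulation step.

    buffer[k] holds future step t + k at noise level k + 1, so near-future
    steps are almost clean and far-future steps are almost pure noise.
    """
    levels = list(range(1, horizon + 1))
    # Warm-up buffer is assumed to be given; its cost is ignored in this sketch.
    buffer = [clean_state(k) + (lvl / horizon) * rng.gauss(0.0, 1.0)
              for k, lvl in enumerate(levels)]
    executed, calls = [], 0
    for t in range(num_steps):
        # A single pass lowers the noise level of every buffered step by one.
        buffer = [project(toy_denoise(x, lvl, lvl - 1, clean_state(t + k)))
                  for k, (x, lvl) in enumerate(zip(buffer, levels))]
        calls += 1
        executed.append(buffer.pop(0))  # first entry is now fully denoised
        # Append a fresh, fully noised entry for the new far-future step.
        buffer.append(clean_state(t + horizon) + rng.gauss(0.0, 1.0))
    return executed, calls
```

    In this toy setting, a 10-step rollout with a horizon of 16 uses 10 denoising
    passes in the amortized variant versus 160 in the baseline, while both produce the
    same executed trajectory. Passing, for example, `project=lambda x: min(x, 2.0)`
    clips every intermediate and final state to an upper bound, which is one simple
    way an inference-time hard constraint could be applied.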