March 9, 2022
Expanding the Waymo Open Dataset with new labels and challenges
We launched the Waymo Open Dataset back in 2019 as one of the largest and most diverse autonomous driving datasets ever released for research. At the time, it consisted of multimodal sensor data from 1,000 driving segments. Thanks to the overwhelmingly positive reception and high engagement, we have continuously evolved the dataset beyond its initial scope, nearly doubling the size of our Perception dataset and introducing a Motion dataset that enables prediction tasks. The Waymo Open Dataset remains one of the most complete and comprehensive autonomous driving datasets, contributing to more than 500 publications and giving the research and academic community high-quality data that is complex and resource-intensive to gather.
Today, we are augmenting the Waymo Open Dataset with additional labels to expand the set of tasks researchers can explore. This expansion includes:
Key point labels. Key points and pose estimation can be a valuable addition to perception and behavior prediction models, capturing subtle but important cues such as a cyclist’s turn gesture. We believe our key point label release is the most comprehensive public dataset of its kind available for the autonomous driving space, and we are excited to see what the broader research community will do with it to advance the field of human pose estimation.
3D Segmentation labels. Segmentation has been a significant asset to the research community for quite some time; however, most public datasets for the autonomous driving space include only bounding boxes to represent and classify objects, which can leave important information out of the picture. Segmentation labels instead classify each pixel of an image and each return of a lidar point cloud, assigning it to an object class. We are enabling this level of granularity for researchers by adding 3D segmentation labels for 23 classes across 1,150 segments of the Waymo Open Dataset; a decoding sketch follows this list.
2D-to-3D Bounding box correspondence. Associating 2D camera bounding boxes with the corresponding 3D boxes in lidar labels can be challenging or ambiguous. We are adding canonical 2D-to-3D bounding box correspondence labels to further enable research on sensor fusion for object detection and scene understanding; a sketch of the heuristic matching these labels replace also follows this list.
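To make the segmentation labels concrete, here is a minimal sketch of decoding the per-point 3D segmentation labels from a Frame proto using the waymo-open-dataset pip package. The message and field names (segmentation_label_compressed, MatrixInt32) follow the released proto schema; treat them as assumptions and verify against the official tutorial.

```python
# Minimal sketch: decode per-point 3D segmentation labels for the top lidar's
# first return. Field names follow the released waymo-open-dataset schema and
# should be verified against the official tutorial.
import zlib

import numpy as np
import tensorflow as tf
from waymo_open_dataset import dataset_pb2

def decode_segmentation_labels(frame):
    """Returns an (H, W, 2) array of [instance_id, semantic_class] per lidar
    return, or None if this frame carries no segmentation labels."""
    for laser in frame.lasers:
        if laser.name != dataset_pb2.LaserName.TOP:
            continue
        compressed = laser.ri_return1.segmentation_label_compressed
        if not compressed:  # only a subset of frames carry segmentation labels
            return None
        matrix = dataset_pb2.MatrixInt32()
        matrix.ParseFromString(zlib.decompress(compressed))
        return np.array(matrix.data, np.int32).reshape(matrix.shape.dims)
    return None

# 'segment.tfrecord' is a placeholder path for one downloaded dataset segment.
for record in tf.data.TFRecordDataset('segment.tfrecord', compression_type=''):
    frame = dataset_pb2.Frame()
    frame.ParseFromString(bytes(record.numpy()))
    labels = decode_segmentation_labels(frame)
    if labels is not None:
        print('range image labels:', labels.shape)  # e.g. (64, 2650, 2)
```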
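And for contrast, here is the kind of heuristic matching researchers have had to rely on without canonical correspondences: greedily associating projected 3D boxes with camera boxes by 2D IoU. The box format and the 0.5 threshold are illustrative assumptions, and the ambiguity of this approach under occlusion and truncation is exactly what the new labels resolve.

```python
# Illustrative numpy-only sketch of the naive 2D IoU heuristic for associating
# projected 3D boxes with camera boxes. Box format [x_min, y_min, x_max, y_max]
# in image pixels is an assumption for the example.
import numpy as np

def iou_2d(a, b):
    """IoU of two axis-aligned boxes given as [x_min, y_min, x_max, y_max]."""
    lt = np.maximum(a[:2], b[:2])   # top-left corner of the intersection
    rb = np.minimum(a[2:], b[2:])   # bottom-right corner of the intersection
    wh = np.clip(rb - lt, 0.0, None)
    inter = wh[0] * wh[1]
    area = lambda box: (box[2] - box[0]) * (box[3] - box[1])
    return inter / (area(a) + area(b) - inter + 1e-9)

def match_by_iou(projected_3d_boxes, camera_boxes, threshold=0.5):
    """Greedily assign each projected 3D box to its best-overlapping camera
    box. Ambiguous under occlusion, which the canonical labels resolve."""
    matches = {}
    for i, p in enumerate(projected_3d_boxes):
        ious = [iou_2d(np.asarray(p), np.asarray(c)) for c in camera_boxes]
        best = int(np.argmax(ious))
        if ious[best] >= threshold:
            matches[i] = best
    return matches

print(match_by_iou([[10, 10, 50, 80]], [[12, 8, 52, 78], [200, 40, 260, 90]]))
# -> {0: 0}
```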
In addition to all these new resources, we are excited to kick off the 2022 Waymo Open Dataset Challenges, featuring:
Motion Prediction: Given agents' tracks for the past 1 second on a corresponding map, predict the positions of up to 8 agents for 8 seconds into the future. We’re also introducing a new metric, named ‘soft mAP’, to better capture model performance.
Occupancy and Flow Prediction: Given agents' tracks for the past 1 second on a corresponding map, predict the bird's-eye view (BEV) occupancy and flow of all currently observed and currently occluded vehicles for 8 seconds into the future; a BEV rasterization sketch appears after this list. Read our paper to learn more about the problem definition and Waymo's work on occupancy and flow prediction.
3D Semantic Segmentation: Given one or more lidar range images and the associated camera images, produce a semantic class label for each lidar point; a mean-IoU evaluation sketch also appears below.
3D Camera-only Detection: Given one or more images from multiple cameras, produce a set of 3D upright boxes for the visible objects in the scene. [COMING SOON]
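For the Occupancy and Flow Prediction challenge, the sketch below rasterizes agent boxes into a BEV occupancy grid. The grid size, resolution, and oriented-box fill are illustrative assumptions, not the challenge's official rendering; see the challenge rules for the exact specification.

```python
# Sketch of BEV occupancy rasterization from agent tracks. Grid size and
# resolution are illustrative assumptions, not the official challenge spec.
import numpy as np

def rasterize_occupancy(agents, grid=256, meters=80.0):
    """agents: iterable of (x, y, length, width, heading) in ego-centered
    meters. Returns a (grid, grid) float32 BEV occupancy map."""
    occ = np.zeros((grid, grid), np.float32)
    res = meters / grid  # meters per cell
    # Precompute cell-center coordinates in the ego frame.
    ys, xs = np.meshgrid(np.arange(grid), np.arange(grid), indexing='ij')
    cx = (xs + 0.5) * res - meters / 2.0
    cy = (ys + 0.5) * res - meters / 2.0
    for x, y, length, width, heading in agents:
        c, s = np.cos(heading), np.sin(heading)
        # Transform cell centers into the agent box frame, then test membership.
        dx, dy = cx - x, cy - y
        u = c * dx + s * dy    # along the box's length axis
        v = -s * dx + c * dy   # along the box's width axis
        occ[(np.abs(u) <= length / 2) & (np.abs(v) <= width / 2)] = 1.0
    return occ

# Example: one 5m x 2m vehicle at the origin, heading 45 degrees.
print(rasterize_occupancy([(0.0, 0.0, 5.0, 2.0, np.pi / 4)]).sum(), 'occupied cells')
```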
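For 3D Semantic Segmentation, per-class intersection-over-union averaged across classes (mean IoU) is the standard way to score per-point predictions; whether the leaderboard uses exactly this formulation is defined by the official rules, so treat the sketch below as illustrative.

```python
# Per-class IoU and mean IoU over per-point class predictions: the standard
# semantic segmentation metric. Illustrative, not the official challenge code.
import numpy as np

def mean_iou(pred, gt, num_classes):
    """pred, gt: 1-D int arrays of per-point class ids. Returns per-class IoU
    (NaN for classes absent from both) and the mean over present classes."""
    ious = np.full(num_classes, np.nan)
    for c in range(num_classes):
        inter = np.sum((pred == c) & (gt == c))
        union = np.sum((pred == c) | (gt == c))
        if union > 0:
            ious[c] = inter / union
    return ious, np.nanmean(ious)

pred = np.array([0, 1, 1, 2, 2, 2])
gt   = np.array([0, 1, 2, 2, 2, 1])
per_class, miou = mean_iou(pred, gt, num_classes=3)
print(per_class, miou)  # [1.0, 0.333..., 0.5], mean ~0.611
```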
The winner of each challenge will receive a $15,000 cash award, with the second-place winner receiving $5,000 and the third-place winner receiving $2,000.
You can find the rules for participating in the challenges here. The challenges close at 11:59 PM Pacific on May 23, 2022, but the leaderboards will remain open for future submissions. We’re also inviting some participants to present their work at the Workshop on Autonomous Driving at CVPR, scheduled for June 20, 2022.
Good luck and we can’t wait to see what you will accomplish!