This page describes the specifications and data formats of the motion dataset.


The motion dataset is provided as sharded TFRecord files containing protocol buffer data. The data are split into training (70%), test (15%), and validation (15%) sets.
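To make "TFRecord files containing protocol buffer data" more concrete, here is a minimal sketch that walks the record framing using only the Python standard library. It assumes the standard TFRecord layout (a little-endian 8-byte length, a 4-byte checksum of the length, the payload, and a 4-byte checksum of the payload) and does not verify the checksums. The names iter_tfrecord_payloads and _frame are hypothetical helpers, not part of the dataset tooling. In practice tf.data.TFRecordDataset would normally be used instead, and each payload would then be parsed as a Scenario or tf.Example proto.

```python
import io
import struct
from typing import BinaryIO, Iterator


def iter_tfrecord_payloads(stream: BinaryIO) -> Iterator[bytes]:
    """Yield the raw payload bytes of each record in a TFRecord stream.

    Checksums are read but not verified, so this is only a sketch.
    """
    while True:
        header = stream.read(8)
        if not header:
            return  # clean end of file
        if len(header) < 8:
            raise ValueError("truncated record length")
        (length,) = struct.unpack("<Q", header)
        stream.read(4)  # masked CRC of the length field (ignored here)
        payload = stream.read(length)
        if len(payload) < length:
            raise ValueError("truncated record payload")
        stream.read(4)  # masked CRC of the payload (ignored here)
        yield payload


def _frame(payload: bytes) -> bytes:
    """Build one record with zeroed checksums, for demonstration only."""
    return struct.pack("<Q", len(payload)) + b"\x00" * 4 + payload + b"\x00" * 4


# Two fake records stand in for serialized protos from one shard.
demo = io.BytesIO(_frame(b"first") + _frame(b"second"))
payloads = list(iter_tfrecord_payloads(demo))
```

Each yielded payload is an opaque byte string; decoding it requires the corresponding proto definition.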

The dataset is composed of 103,354 segments, each containing 20 seconds of object tracks at 10 Hz and map data for the area covered by the segment. These segments are further broken into 9-second windows (1 second of history and 8 seconds of future data) with a 5-second overlap. The data are provided in two forms. The first form is stored as Scenario protocol buffers. The second form converts the Scenario protos into tf.Example protos containing tensors for use in building models. Details of both formats follow at the end of the page.
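The windowing arithmetic can be sketched as follows. This is only an illustration under two assumptions that the text does not state: that the first window starts at the beginning of the segment, and that windows which would run past the end of the segment are dropped. The function name window_start_times is hypothetical.

```python
from typing import List

SEGMENT_SECONDS = 20
WINDOW_SECONDS = 9
OVERLAP_SECONDS = 5


def window_start_times(segment_s: int, window_s: int, overlap_s: int) -> List[int]:
    """Start times (in seconds) of every full window that fits in the segment."""
    stride_s = window_s - overlap_s  # consecutive windows start this far apart
    return list(range(0, segment_s - window_s + 1, stride_s))


starts = window_start_times(SEGMENT_SECONDS, WINDOW_SECONDS, OVERLAP_SECONDS)
# Pair each start time with its end time to get (start, end) spans in seconds.
windows = [(s, s + WINDOW_SECONDS) for s in starts]
```

Under these assumptions, a 5-second overlap gives a 4-second stride, so a 20-second segment yields three full windows covering 0 to 9, 4 to 13, and 8 to 17 seconds.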

               | Scenario proto                  | tf.Example
Segment length | 9 seconds (1 history, 8 future) | 9 seconds (1 history, 8 future)
Maps           | Vector maps                     | Sampled as points
Representation | Single proto                    | Set of tensors

To enable the motion prediction challenge, the ground truth future data for the test set is hidden from challenge participants. As such, the test sets contain only 1 second of history data. The training and validation sets contain the ground truth future data for use in model development. In addition, the test and validation sets provide a list of up to 8 object tracks in the scene to be predicted. These are selected to include interesting behavior and a balance of object types.

Dozens of road users navigate a neighborhood in San Francisco. For reference, objects in yellow circles are those whose tracks should be predicted; objects in green circles are the interacting objects.

Data Sampling

Each 9 second sequence in either the training or validation set contains 1 second of history data, 1 sample for the current time, and 8 seconds of future data at 10 Hz sampling. This corresponds to 10 history samples, 1 current time sample, and 80 future samples for a total of 91 samples. The test set hides the ground truth future data for a total of 11 samples (10 history and 1 current time sample).
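The sample counts above follow directly from the sampling rate, and can be checked with a few lines of arithmetic. The sketch assumes zero-based indexing with samples ordered from oldest to newest; all variable names are illustrative only.

```python
SAMPLE_RATE_HZ = 10
HISTORY_SECONDS = 1
FUTURE_SECONDS = 8

num_history = HISTORY_SECONDS * SAMPLE_RATE_HZ  # history samples
num_current = 1                                 # the current-time sample
num_future = FUTURE_SECONDS * SAMPLE_RATE_HZ    # future samples
num_total = num_history + num_current + num_future

# Assumed layout: history first, then the current sample, then the future.
current_index = num_history
history_slice = slice(0, current_index)
future_slice = slice(current_index + 1, num_total)

# Time offsets in seconds from the start of the window.
timestamps = [i / SAMPLE_RATE_HZ for i in range(num_total)]

# The test set keeps only the history and current-time samples.
test_set_samples = num_history + num_current
```

With this layout the current-time sample sits at index 10, the history occupies indices 0 through 9, and the future occupies indices 11 through 90.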

Coordinate frames

All coordinates in the dataset are in a global frame with X as East, Y as North and Z as up. The origin of the coordinate system changes in each scene. The origin is an arbitrary point and may be far from the objects in the scene. All units are in meters.
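Because the origin is arbitrary and may be far from the objects, it can be convenient to translate coordinates so that a chosen reference point becomes the origin. The sketch below shows one way to do that; it is a suggestion rather than something the dataset requires, and the function name recenter and the sample positions are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float, float]  # (x east, y north, z up), in meters


def recenter(points: List[Point], reference: Point) -> List[Point]:
    """Translate points so that `reference` becomes the origin.

    The axes keep their meaning (X east, Y north, Z up); only the origin moves.
    """
    rx, ry, rz = reference
    return [(x - rx, y - ry, z - rz) for x, y, z in points]


# Hypothetical track positions in meters, far from the scene origin.
track = [(8400.0, -2100.0, 12.0), (8401.5, -2100.0, 12.0), (8403.0, -2099.5, 12.0)]
local = recenter(track, reference=track[0])
```

Only a translation is applied here, so distances and headings are unchanged.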

Scenario Proto format

An intersection in Los Altos, California with multiple vehicles. For reference, objects in yellow circles are those whose tracks should be predicted; objects in green circles are the interacting objects.

Below is an overview of the Scenario protocol buffer format. Please see the Scenario proto definition for full details.

The scenario proto contains a set of object tracks each containing an object state for each time step in the scenario. It also contains static map features and a set of dynamic map features (e.g. traffic signals) for each time step.

Here is an outline of the proto fields:


For any time step where a state’s valid bit is set to false, no measurement of the object state is provided in this dataset. Note also that some included objects have no valid states in the 1 second of past history and have valid states only in future time steps. While these objects cannot be used for motion prediction, they are included in the dataset for visualization purposes or for research in predicting unseen objects in the future.
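As an illustration of how the valid bits might be used, the sketch below separates objects that are observed before the prediction horizon from those that appear only in the future. It represents each track's validity as a plain list of booleans, which is a simplification of the proto layout, and it assumes that the observed portion is the first 11 samples (10 history samples plus the current-time sample). The function names are hypothetical.

```python
from typing import List

NUM_OBSERVED_STEPS = 11  # assumed: 10 history samples plus the current-time sample


def has_observed_state(valid: List[bool], num_observed: int = NUM_OBSERVED_STEPS) -> bool:
    """True if the object has at least one valid state in the observed steps."""
    return any(valid[:num_observed])


def valid_future_indices(valid: List[bool], num_observed: int = NUM_OBSERVED_STEPS) -> List[int]:
    """Indices of future steps that carry a real measurement."""
    return [i for i in range(num_observed, len(valid)) if valid[i]]


# Hypothetical validity flags for two 91-step tracks.
seen_early = [True] * 20 + [False] * 71
appears_late = [False] * 30 + [True] * 61

predictable = [name for name, flags in
               [("seen_early", seen_early), ("appears_late", appears_late)]
               if has_observed_state(flags)]
```

A track such as appears_late would be skipped as a prediction input but could still be drawn in a visualization. The indices returned by valid_future_indices identify which future steps can be compared against ground truth.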

tf.Example Proto format

Each tf.Example proto contains the same information as the Scenario protos described above, but all data has been converted to tensors. Please see the tf.Example proto definition for full details.


If you would prefer to jump right in, check out the tutorials here. The GitHub repo also includes a Quick Start with installation instructions for the Waymo Open Dataset supporting code.