FAQ
How can I quickly get started using the dataset?

Check out the Colab tutorial for a walkthrough of the data format. Just open the tutorial notebook in Colab. Note that the tutorial currently uses a few sample frames; it does not access the actual dataset files.

The Quick Start walks you through installing the codebase on your local machine.

Where can I learn the basics of Colab?

You can learn the basics of Colab with this Welcome notebook.

Where can I learn the basics of TensorFlow?

Google’s Machine Learning Crash Course is the best place to start.

Where can I find the labeling specification?

You can find it in the GitHub repo.

Where can I find the metrics code?

Check out the metrics directory in the GitHub repo.

How can I download all of the dataset files programmatically?

After you register on the site, you’ll be granted access to the Google Cloud Storage bucket containing all of the files; this can take up to 2 business days. The link to the bucket is on the Download page. Bucket access makes programmatic downloading straightforward and lets you read the files through Google Cloud APIs.
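As a rough illustration of programmatic downloading, the sketch below builds and optionally runs a recursive `gsutil` copy. It assumes the `gsutil` command-line tool is installed and authenticated with the Google account that was granted access. The bucket name and prefix are placeholders, not the real ones (take the actual location from the link on the Download page), and the helper names `build_gsutil_copy` and `download` are hypothetical.

```python
import shlex
import subprocess
from typing import List


def build_gsutil_copy(bucket: str, prefix: str, dest_dir: str) -> List[str]:
    """Return the gsutil argument list that copies gs://bucket/prefix into dest_dir."""
    prefix = prefix.strip("/")
    source = f"gs://{bucket}/{prefix}" if prefix else f"gs://{bucket}"
    # -m runs the transfer in parallel; cp -r copies recursively.
    return ["gsutil", "-m", "cp", "-r", source, dest_dir]


def download(bucket: str, prefix: str, dest_dir: str, dry_run: bool = True) -> List[str]:
    """Print the command (dry run) or execute it; return the argument list either way."""
    cmd = build_gsutil_copy(bucket, prefix, dest_dir)
    if dry_run:
        # Show the command without touching the network.
        print(shlex.join(cmd))
    else:
        # Raises CalledProcessError if gsutil exits with a non-zero status.
        subprocess.run(cmd, check=True)
    return cmd


if __name__ == "__main__":
    # Placeholder bucket and prefix: replace with the values from the Download page.
    download("your-bucket-name", "your/prefix", "./data", dry_run=True)
```

Keeping `dry_run=True` as the default lets you inspect the exact command before starting a large transfer.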

How long does it take to get access to the Google Cloud Storage bucket?

It may take up to 2 business days to be granted access. We are working on making this instantaneous.

What are you doing to ensure the privacy of people in the images?

We used an automated method to blur license plates and faces in our dataset. We then manually confirmed that no unblurred faces or license plates appear in the dataset.

If you would like to request modification or removal of any images or segments from the Waymo Open Dataset, please fill out the privacy form at https://waymo.com/open/terms. If you have any other privacy concerns, please contact us at open-dataset@waymo.com.

Is this data being offered under an open source license?

The Dataset license has certain limitations around distribution and should not be considered an open source license. If you are interested in a license with different terms, please contact us at open-dataset@waymo.com.

What are some examples of acceptable and unacceptable uses of the dataset under its license?

Example 1, Acceptable Use:

You are a researcher at a university. You use the dataset to perform benchmarking on algorithms you’ve developed. You publish the results. Your published paper includes small extracts of data taken from the dataset for purposes of illustration. Your paper provides the appropriate attribution to the Waymo Open Dataset.

Example 2, Acceptable Use:

You are a researcher at a technology company. You experiment on the dataset using internal systems that are not used to provide a product or service to customers. You develop algorithms, model definitions, and training code as a result of those experiments, and check them into your internal systems with the appropriate attribution to the Waymo Open Dataset. You submit those algorithms and model definitions to a conference for public review. You include the appropriate attribution to the Waymo Open Dataset. Your submission is accepted. You release the training code to let others replicate your results.

You are a researcher at another company. You find the publication interesting. You download the dataset from Waymo and train the published model definition against the dataset to see if you can confirm the results.

Example 3, Unacceptable Use:

You are an engineer at an autonomous vehicle company. You use the dataset to train a prototype object detection model for use on a vehicle at a test track. You use this trained model as a placeholder until you build a large enough internal dataset to train your model against.

Example 4, Unacceptable Use:

You train and fine-tune existing models based on the dataset. You use weights and biases from that model and deploy them in a system you intend to use for current or future customers.

Why can’t I publish my trained model?

We want the dataset to be used for Non-Commercial Purposes only, but we want open sharing within the community of registered researchers, so we put in mechanisms to encourage people to join.

First, there are many things that researchers can do with the dataset. For example, researchers are welcome to use the data and publish their findings. Researchers can publish algorithms, model definitions and training code developed using the dataset. Other researchers can replicate the results by downloading the dataset from Waymo and using the dataset for training.

Second, we want transparency in how models are trained, so we encourage researchers to provide training instructions that let others replicate the results by downloading the dataset from Waymo and training on it.

Third, although you cannot publish trained models, you can share the trained model you develop within the community of registered researchers.

Who should I contact with questions, thoughts, and suggestions on the Open Dataset?

Please reach out to open-dataset@waymo.com.