
October 28, 2021

The Waymo Driver Handbook: Teaching an autonomous vehicle how to perceive and understand the world around it

  • Technology
A sketch of a Waymo Vehicle

Imagine you are driving through downtown San Francisco and a cyclist, traveling against the flow of traffic, cuts right in front of you. Or maybe you are navigating a narrow, two-way street with a car heading in the opposite direction, and you nudge over to let it pass. Perhaps you are driving late at night when a worker, hidden from view, steps out from behind a truck and into the middle of the street. These are just a few examples of the common yet complex scenarios the Waymo Driver encounters in cities like San Francisco.

It’s vital that our fully autonomous driving technology, the Waymo Driver, can navigate common situations like these safely. To do so, the Waymo Driver needs advanced technology that allows it to perceive and understand what is happening on and near the road in real time. In this post, we’ll explain how our perception system makes that possible, creating safe driving experiences for all road users.

Building custom perception systems for the Waymo Driver

The Waymo Driver’s perception system includes two parts: a highly sophisticated, custom suite of sensors we’ve developed specifically for fully autonomous operations, and state-of-the-art software to make sense of the information those sensors gather.

Sensors are where a system’s ability to reason about the world begins, and their quality has significant impacts downstream. Without good sensor data, you won’t always be able to correctly classify objects, estimate their paths or understand their intent, accurately predict their behavior, or make good driving decisions. With advances in machine learning, the advantages of good sensor data percolate across the entire stack. Simply put, good hardware leads to better data, which enables smarter software.

The Waymo Driver’s sensors have been engineered and optimized for the sole task of driving. They are designed to let the Waymo Driver react to unexpected behavior from any direction, such as a skater overtaking one of our cars on a busy city street. Our systems can also see objects, like debris, at great distances, giving the Waymo Driver more time to react safely, especially when driving a Class 8 truck down a highway.

A compilation of lidar videos from the Waymo Driver

First, our family of lidar sensors allows the Waymo Driver to see the world in incredible detail, using light waves to paint rich 3D pictures, known as point clouds, both up close and at long distances. Point clouds capture the size and distance of objects up to 300 meters away in all lighting conditions, so we can spot a pedestrian walking down a dark street on a moonless night an entire city block away. This is especially important since current systems only utilizing camera and radar technology frequently fail to protect pedestrians in dark conditions according to independent testing by the American Automobile Association (AAA).
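To make the idea of a point cloud concrete, here is a minimal sketch of how a single lidar return (a measured range plus the beam's two angles) becomes one 3D point. It is a generic geometry illustration, not a description of the Waymo Driver's software; the function name and the axis convention (x forward, y left, z up) are assumptions chosen for the example.

```python
import math

def lidar_return_to_point(range_m, azimuth_deg, elevation_deg):
    """Convert one lidar return to an (x, y, z) point in meters.

    Assumed convention (hypothetical): x forward, y left, z up; azimuth is
    measured counterclockwise from the x axis, elevation upward from horizontal.
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    horizontal = range_m * math.cos(el)  # projection onto the ground plane
    return (
        horizontal * math.cos(az),
        horizontal * math.sin(az),
        range_m * math.sin(el),
    )

# Three made-up returns: (range in meters, azimuth in degrees, elevation in degrees).
returns = [(300.0, 0.0, 0.0), (10.0, 90.0, 0.0), (5.0, 0.0, 30.0)]
point_cloud = [lidar_return_to_point(*r) for r in returns]
```

A real point cloud is simply many such points collected per sweep, which is why it captures both the size and the distance of objects.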

Rainbow bounding boxes on video footage from the Waymo Driver in San Francisco

Second, our vehicles are equipped with a range of cameras that provide the Waymo Driver with different perspectives of the road. These cameras capture long-range imagery and complement the rest of our system, adding sources of information that give the Waymo Driver an even deeper understanding of its environment.

A video of what the Waymo Driver's radar sees while driving in San Francisco

Finally, we’ve built one of the world’s first radar imaging systems for fully autonomous driving to complement our cameras and lidars. Our radar can instantly perceive a vehicle or pedestrian’s speed and trajectory even in tough weather conditions, such as snow, fog, and rain, providing the Waymo Driver with unparalleled resolution, range and field of view for safe driving. 

Fusing our sensors’ inputs for safety

But our sensors are only half of the story. To take full advantage of the benefits of our fifth-generation hardware, we’ve built our latest-generation software stack by leveraging cutting-edge research in Artificial Intelligence (AI) and Machine Learning (ML). Every major part of our software – whether it’s perception, behavior prediction, or planning – uses advanced ML models that benefit from our unparalleled driving experience from more than 20 million autonomously driven miles and the richness of the data our sensors gather.

For the Waymo Driver to build a coherent picture of its environment, software is needed to turn the data from the sensors into a form the Driver can interpret. Our sensors produce very different types of data (including video footage, fine-grained lidar point clouds, and radar imagery) over different ranges and fields of view. Our sensor diversity means we can use a technique called sensor fusion to improve the confidence of our detections and characterizations.
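One simple way to see why agreement between diverse sensors raises confidence is a "noisy-OR" combination of per-sensor detection scores. The sketch below is illustrative only and is not the Waymo Driver's fusion method: the function name and the scores are made up, and it assumes the sensors' errors are independent, which is a strong simplification.

```python
def fuse_confidences(confidences):
    """Noisy-OR fusion: probability that at least one detection is real,
    assuming each sensor's errors are independent of the others."""
    miss = 1.0
    for c in confidences:
        if not 0.0 <= c <= 1.0:
            raise ValueError("confidence must be in [0, 1]")
        miss *= 1.0 - c  # probability that every sensor so far is wrong
    return 1.0 - miss

# Hypothetical per-sensor scores for the same candidate object.
per_sensor = {"camera": 0.6, "lidar": 0.7, "radar": 0.5}
fused = fuse_confidences(per_sensor.values())
print(f"fused confidence: {fused:.2f}")
```

Under these assumptions, three individually uncertain detections combine into a fused score higher than any single one.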

Sensor fusion allows us to amplify the advantages of each sensor. Lidar, for example, excels at providing depth information and detecting the 3D shape of objects, while cameras are important for picking out visual features, such as the color of a traffic signal or a temporary road sign, especially at longer distances. Meanwhile, radar is highly effective in bad weather and in scenarios when it’s crucial to track moving objects, such as a deer dashing out of a bush and onto the road.

The fusion of high quality sensor information enables the Waymo Driver to operate in a broad range of driving conditions, from busy urban streets to long stretches of highway. Our sensors’ long range is particularly important for safe driving on high-speed freeways, where the speeds involved make it incredibly important to perceive the environment from a great distance. Imagine a Waymo Via truck on a stretch of freeway. From a long distance, the Waymo Driver detects slowing traffic using camera and radar data and begins decelerating. As the truck gets closer, detailed lidar data provides additional information to help the Waymo Driver refine its response, enabling it to respond to traffic congestion at a safe distance.

Sensor fusion is also invaluable in situations that require nuance, such as interpreting other road users’ intentions. For example, combining fine-grained point clouds with the information from our other sensors leads to smarter machine learning. If our camera system spots a stop sign, lidar can help the Driver determine whether it is a real sign or just a reflection in a storefront or an advertising image on the back of a bus.
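As a toy illustration of the stop sign example, the check below asks whether the surface that lidar measures at the camera detection's location is small enough to be a free-standing sign. This is a hypothetical heuristic written for this post's example, not the Waymo Driver's logic; `LidarSurface`, `looks_like_physical_sign`, and the 1.5 m threshold are all assumptions.

```python
from dataclasses import dataclass

@dataclass
class LidarSurface:
    """Extent of the contiguous surface lidar measures where the camera saw a sign."""
    width_m: float
    height_m: float

# Assumed threshold: a real sign face is small; a much larger surface is more
# likely a window, wall, or vehicle body carrying a reflection or an image.
MAX_SIGN_EXTENT_M = 1.5

def looks_like_physical_sign(surface, max_extent_m=MAX_SIGN_EXTENT_M):
    """Return True if the measured surface is small enough to be a sign."""
    return surface.width_m <= max_extent_m and surface.height_m <= max_extent_m

roadside_sign = LidarSurface(width_m=0.75, height_m=0.75)
bus_rear = LidarSurface(width_m=2.5, height_m=3.0)
print(looks_like_physical_sign(roadside_sign), looks_like_physical_sign(bus_rear))
```

The point is only that 3D geometry from lidar carries information the camera image alone does not; a production system would weigh many more cues than surface size.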

Finally, as well as interpreting the features of the environment it can perceive, our perception system also makes inferences about what it can’t. For example, if the Waymo Driver sees a passenger vehicle overtake a slow-moving truck on a freeway and, in turn, become occluded, the Driver can remember that the vehicle might still be there before attempting to change lanes. Our perception system is also aware of its own limits, which means the Waymo Driver does not outdrive what it can see. If we know our sensors cannot see around a blind curve on a narrow country road, this helps our Driver determine that it should slow down and proceed with caution.
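The idea of not outdriving what you can see can be sketched with basic kinematics: the distance covered during a reaction delay plus the braking distance must fit within the visible distance. The example below solves v*t + v^2/(2a) = d for the speed v. The function name, the 1.0 s reaction time, and the 3.0 m/s^2 deceleration are illustrative assumptions, not Waymo Driver parameters.

```python
import math

def max_speed_for_sight_distance(sight_m, reaction_s=1.0, decel_mps2=3.0):
    """Largest speed v (m/s) such that v*reaction_s + v**2 / (2*decel_mps2) <= sight_m.

    Positive root of v**2 + 2*a*t*v - 2*a*d = 0, with a = decel_mps2,
    t = reaction_s, d = sight_m.
    """
    if sight_m <= 0:
        return 0.0
    a_t = decel_mps2 * reaction_s
    return -a_t + math.sqrt(a_t ** 2 + 2.0 * decel_mps2 * sight_m)

# Hypothetical blind curve with 40 m of visible road ahead.
v = max_speed_for_sight_distance(40.0)
print(f"{v:.1f} m/s (about {v * 3.6:.0f} km/h)")
```

With these assumed values, 40 m of visible road supports roughly 12.8 m/s; a shorter sight line or gentler braking lowers that limit, which is the reasoning behind slowing for a blind curve.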

The future of perception for fully autonomous driving

Sensor technology is becoming more advanced all the time. Our state-of-the-art fifth-generation Waymo Driver, for instance, allows us to create a safe and comfortable experience in more diverse driving environments, such as urban streets and interstate highways.

We also continue to explore applied machine learning techniques that help us reach new frontiers in perception: building more efficient and accurate detection models for 3D objects at greater distances, exploiting task relations at scale to improve performance while reducing the need for labeled data, and more. These advances allow us to unlock the full potential of our new sensors, reduce the amount of compute perception uses, and deepen our understanding of other objects on the road.

Perception is one of the most fundamental parts of our technology stack. In future posts, we’ll explore how the Waymo Driver uses this sophisticated perception information to predict the behavior of other road users and determine the best course of action. Stay tuned!

Join our team and help us build the World’s Most Experienced Driver™. Waymo is looking for talented software and hardware engineers, researchers, and out-of-the-box thinkers to help us tackle real-world problems, and make the roads safer for everyone. Come work with other passionate engineers and world-class researchers on novel and difficult problems—learn more at waymo.com/careers.