December 14, 2022
Waymo's Collision Avoidance Testing: Evaluating our Driver’s Ability to Avoid Crashes Compared to Humans
It’s happened to almost every driver: that terrifying moment when you have to urgently brake or swerve to avoid a collision caused by other road users’ behavior. Like a human driver, the Waymo Driver encounters potential hazards — from a vehicle running a red light to a car suddenly changing lanes. To evaluate our Driver’s ability to avoid or mitigate crashes in situations like these, we developed a comprehensive scenario-based testing methodology called Waymo’s Collision Avoidance Testing (CAT). To maintain transparency and provide the public with a deeper understanding of our safety approach, we are publishing a paper to describe how we judge good collision avoidance performance, how we identify the right set of scenarios to test, and the testing tools we’ve developed.
Fully autonomous systems need to handle the entire driving task with no human in the driver’s seat, so they go through much more in-depth testing than driver assist systems. At Waymo, one method we use to evaluate the safety of our Driver is scenario-based testing — a combination of virtual, test-track, and real-world driving. We used it, among many other methods, to help assess safety readiness before removing a human from the driver’s seat in Chandler, Downtown Phoenix, and San Francisco and have been using it ever since to evaluate new software releases for our rider-only fleets.
We evaluate how well the Waymo Driver avoids crashes and mitigates serious injury risk in urgent situations by comparing its behavior to a reference model, which we introduced earlier this year, of a human driver who is non-impaired, with eyes always on the conflict (NIEON): essentially, an attentive driver who doesn’t get distracted or fatigued.* All human drivers take their gaze or attention off the roadway from time to time, so the NIEON model represents a level of performance that doesn’t exist in the human population and provides a high benchmark against which to compare the Waymo Driver.
To identify relevant test scenarios, we use existing driving data from Waymo's many years of experience, human crash data such as police accident databases and crashes recorded by dash cams, and expert knowledge about our operational design domain, which includes geographic areas, driving conditions, and road types where our Driver is going to operate. Over time, we continue to add novel and representative scenarios that we encounter on public roads and in simulations, or as we expand to new territories.
Developed since 2016 and informed by our millions of miles driven on public roads as well as thousands of real-world human crashes, our scenario database provides comprehensive coverage of hazardous situations. Because the most common types of crashes are similar no matter where you drive, our database can be used as a baseline for any city, leading to quicker scalability. It contains a wide range of common situations that can occur almost anywhere — such as a car pulling out from a driveway or a pedestrian crossing against the signal.
Continuous scenario sourcing is facilitated by data collection in new territories where the Waymo Driver operates. For example, our driving in San Francisco and Phoenix allowed us to further expand our dataset with novel interactions with pedestrians. Coupled with the information from public databases on high severity collisions, this data informed the creation of new tests, such as scenarios where pedestrians break from a crowd into the path of the Waymo Driver.
We use several scenario creation methods. The first and most common one is to stage a scenario on a closed test track (such as Castle, our 113-acre closed-course testing facility) and record what happened so that it can be reproduced in simulation. The positions or speeds of other actors can then be modified to make the scenario more likely to require urgent evasive maneuvers to avoid a collision. This allows us to use real vehicles and real “agents,” interacting in a safe manner, while still being able to construct relevant scenarios. Similarly, if a hazardous situation is discovered during on-road testing, we can use the same process to create a simulation and add it as a scenario in the database. For other scenarios that are either too dangerous (e.g., at high speeds) or impractical to collect on a test track (e.g., those requiring specific intersection geometry or highly specialized vehicle types), we create fully synthetic simulations.
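As a rough illustration only (this is not Waymo's tooling, which this post does not describe), the idea of modifying the positions or speeds of other actors in a recorded scenario can be sketched as a small parameter sweep. The names `Actor`, `Scenario`, and `vary_actor`, as well as the fields and the choice of a simple grid of offsets and speed scales, are all assumptions made for the example.

```python
from dataclasses import dataclass, replace
from itertools import product


@dataclass(frozen=True)
class Actor:
    """Hypothetical record of another road user in a logged scenario."""
    name: str
    start_offset_m: float  # distance from the conflict point at t = 0
    speed_mps: float       # initial speed in meters per second


@dataclass(frozen=True)
class Scenario:
    label: str
    actors: tuple  # tuple of Actor


def vary_actor(base, actor_name, offset_deltas_m, speed_scales):
    """Return one scenario variant per (offset delta, speed scale) pair.

    Only the named actor is changed; every other actor is kept as recorded.
    """
    variants = []
    for delta, scale in product(offset_deltas_m, speed_scales):
        actors = tuple(
            replace(
                a,
                start_offset_m=a.start_offset_m + delta,
                speed_mps=a.speed_mps * scale,
            )
            if a.name == actor_name
            else a
            for a in base.actors
        )
        label = f"{base.label}|d={delta:+.1f}m|vx{scale:.2f}"
        variants.append(Scenario(label, actors))
    return variants
```

Calling `vary_actor(base, "cutter", [-2.0, 0.0, 2.0], [1.0, 1.25])` on a recorded scenario would produce six variants, each of which could then be run in simulation. A real system would likely sample many more parameters than this toy grid does.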
We test all scenarios used for the safety evaluation of our latest software releases in simulation, whether they are derived from test track data, real-world data, or synthetic means. Using simulation allows us to evaluate new versions of the Waymo Driver in thousands of scenarios within hours, instead of the months or years that would be required to run those same scenarios on a test track or in the real world.
Taken collectively across our test scenarios, the Waymo Driver consistently demonstrates performance comparable to or better than NIEON, both for collisions generally and specifically for those involving the potential for serious injury. Previously, we showed that the Waymo Driver avoided more crashes and reduced the injury risk more than NIEON in simulated real-world fatal crashes from our Chandler service area.
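To make the notion of "comparable to or better than" a reference model more concrete, here is a minimal sketch of a per-scenario comparison. It is an assumption-laden toy, not the method from the paper: the `Outcome` fields, the `classify` and `summarize` helpers, and the `tolerance` threshold for calling two injury-risk values comparable are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Outcome:
    """Hypothetical simulated result for one scenario."""
    collided: bool
    injury_risk: float  # assumed probability of serious injury; 0.0 if no collision


def classify(driver, reference, tolerance=0.01):
    """Label the driver's outcome relative to the reference model's outcome."""
    # Avoiding a collision the reference did not avoid is better; the reverse is worse.
    if driver.collided != reference.collided:
        return "worse" if driver.collided else "better"
    # Same collision status: compare injury risk within an assumed tolerance.
    diff = driver.injury_risk - reference.injury_risk
    if abs(diff) <= tolerance:
        return "comparable"
    return "better" if diff < 0 else "worse"


def summarize(pairs):
    """Count labels over a mapping of scenario id -> (driver, reference) outcomes."""
    counts = {"better": 0, "comparable": 0, "worse": 0}
    for driver, reference in pairs.values():
        counts[classify(driver, reference)] += 1
    return counts
```

For instance, `classify(Outcome(False, 0.0), Outcome(True, 0.3))` returns `"better"` because the driver avoided a collision that the reference did not. Aggregating with `summarize` gives a collective view across scenarios rather than a verdict on any single one.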
In addition to NIEON, we compare the evaluation results to prior software versions and to the performance of other readiness determination methods. This allows us to track performance over time and continuously discover ways to improve the safety of the Waymo Driver.
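Tracking performance against a prior software version can be pictured as a simple regression check. The sketch below assumes, purely for illustration, that each release's results are stored as a mapping from a scenario identifier to whether the collision was avoided; `find_regressions` is a hypothetical helper, not a Waymo API.

```python
def find_regressions(prior, candidate):
    """Return scenario ids the prior release passed but the candidate fails.

    Both arguments map a scenario id to True when the collision was avoided.
    Scenarios present in only one of the two runs are ignored.
    """
    shared = prior.keys() & candidate.keys()
    return sorted(s for s in shared if prior[s] and not candidate[s])
```

Any scenario returned by such a check would be a natural starting point for investigation before a release moves forward.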
Waymo’s Collision Avoidance Testing is one of many complementary methods that we use to determine safety readiness for fully autonomous operations, with no one behind the wheel. It is designed to be a flexible and iterative process that can be applied across a range of vehicle platforms, from passenger cars to trucks, and different driving environments, including our next ride-hail city, Los Angeles. As we continue to bring the Waymo Driver to more people in more places, this method is an important part of our safety evaluation process. We will continue to share more information about our testing and safety evaluation methods to help the public, regulators, policymakers, and our riders better understand our commitment to safety.
*NIEON is defined by (1) gaze being directed through the windshield toward the forward path during the conflict and (2) a lack of sleepiness and intoxication-related impairment.