Safety leads everything we do at Waymo. This year alone, Waymo has served over 700,000 ride-hailing trips with public riders and no human driver. We couldn’t have hit that milestone without putting safety front and center, and we are working hard to improve the measurement, transparency, and performance of our fleet.
Our comprehensive research — across more than twenty safety papers that we’ve published over the years to enhance transparency — shows that the Waymo Driver performs safely across a range of evaluations. Building on that work, we’ve published two new papers today: one that compares the Waymo Driver’s crash rates to human drivers’ over our 7+ million rider-only miles from Phoenix, San Francisco, and Los Angeles; and another that develops clear human crash benchmarks to enable such comparisons.
Our new research found that Waymo Driver performance led to a significant reduction in the rates of police-reported and injury-causing crashes compared to human drivers in the cities where we operate.
“These reports represent a good-faith effort by Waymo to evaluate how the safety of its autonomous driving system compares with the safety of human driving. The results are encouraging and represent one step in our evolving understanding of autonomous driving safety,” said David Zuby, chief research officer of the Insurance Institute for Highway Safety (IIHS), after reviewing the papers.
How the Waymo Driver compares to humans
No single metric can capture the safety of autonomous driving. We use several evaluation metrics and methods to build a complete picture of safety. One of the many metrics that can be used to monitor the rider-only service after launch is the number of vehicle crashes per mile driven, compared against the rate for human drivers.
In the performance study, we compared the Waymo Driver’s crash rates to human drivers’ on several different benchmarks. This study is one of the first to compare overall crash rates using data from fully autonomous operations only, rather than a mix of fully autonomous driving and testing with a human behind the wheel. Unlike the recent research by Swiss Re that focused on crashes resulting in Waymo’s liability claims, this study includes all Waymo crashes, regardless of the Waymo vehicle’s role in the crash, and with any amount of property damage. It also uses publicly available data, which allows other researchers to replicate the results.
Waymo’s data was derived from crashes reported under NHTSA’s Standing General Order (SGO), covering 7.14 million fully autonomous miles driven 24/7 through the end of October 2023 across Phoenix, San Francisco, and Los Angeles. That data was then compared to the relevant human rates for crashes resulting in police reports, injuries, and/or property damage.
When considering all locations together, compared to the human benchmarks, the Waymo Driver demonstrated:
An 85% reduction, or a 6.8 times lower rate, in crashes involving any injury, from minor to severe and fatal cases (0.41 incidents per million miles for the Waymo Driver vs. 2.78 for the human benchmark)
A 57% reduction, or a 2.3 times lower rate, in police-reported crashes (2.1 incidents per million miles for the Waymo Driver vs. 4.85 for the human benchmark)
This means that over the 7.1 million miles Waymo drove, there were an estimated 17 fewer injuries and 20 fewer police-reported crashes than there would have been if human drivers at the benchmark crash rates had driven the same distance in the areas where we operate.
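The percentages, ratios, and estimated differences above follow from simple arithmetic on the quoted rates. The sketch below reproduces them in Python using only the rates and mileage stated in this post; the function and variable names are illustrative and do not come from the papers.

```python
# Rates are incidents per million miles, as quoted in this post.
MILES_MILLIONS = 7.14  # rider-only miles through the end of October 2023


def compare(waymo_rate, human_rate, miles_millions=MILES_MILLIONS):
    """Return (percent reduction, times-lower ratio, estimated events avoided)."""
    reduction_pct = (1 - waymo_rate / human_rate) * 100
    ratio = human_rate / waymo_rate
    avoided = (human_rate - waymo_rate) * miles_millions
    return reduction_pct, ratio, avoided


injury = compare(0.41, 2.78)  # crashes involving any injury
police = compare(2.1, 4.85)   # police-reported crashes

for label, (pct, ratio, avoided) in [("any-injury", injury),
                                     ("police-reported", police)]:
    print(f"{label}: {pct:.0f}% reduction, {ratio:.1f}x lower, "
          f"~{avoided:.0f} fewer over {MILES_MILLIONS} million miles")
```

Rounded to the precision used in this post, the results match the figures above: 85%, 6.8 times, and about 17 for injury crashes; 57%, 2.3 times, and about 20 for police-reported crashes.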
Compared individually to crash rates in San Francisco, Phoenix, and Los Angeles, the Waymo Driver also significantly outperformed the respective local human benchmarks (though the comparison in LA does not yet have enough mileage to be statistically significant). Notably, local human benchmarks varied from one city to another. For example, San Francisco had the highest rate of crashes where an injury was reported, at 5.55 incidents per million miles, which was approximately three times higher than the national average.
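To see why mileage matters for statistical significance, consider the simplified sketch below. It is not the method used in the papers. It assumes crash counts follow a Poisson distribution and treats the human benchmark rate as exact. The observed count of 3 is back-calculated from the rounded 0.41 rate and 7.14 million miles, and the low-mileage case (0.1 million miles, zero crashes) is entirely hypothetical.

```python
import math


def poisson_cdf(k, mu):
    """P(X <= k) for X ~ Poisson(mu)."""
    return sum(math.exp(-mu) * mu**i / math.factorial(i) for i in range(k + 1))


def p_value_fewer(observed, benchmark_rate, miles_millions):
    """One-sided probability of seeing `observed` or fewer crashes if the
    true rate equaled the benchmark (incidents per million miles)."""
    expected = benchmark_rate * miles_millions
    return poisson_cdf(observed, expected)


# Aggregate any-injury comparison (observed count is an approximation).
p_all = p_value_fewer(observed=3, benchmark_rate=2.78, miles_millions=7.14)

# Hypothetical low-mileage area with no injury crashes at all.
p_small = p_value_fewer(observed=0, benchmark_rate=2.78, miles_millions=0.1)

print(f"aggregate: p = {p_all:.2e}")
print(f"low mileage: p = {p_small:.2f}")
```

Under these assumptions, the aggregate result falls far below a conventional 0.05 threshold. The hypothetical low-mileage result does not, even with zero crashes, because so few crashes would be expected from human drivers over that distance in the first place.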
In general, the Waymo Driver also demonstrated lower property damage rates compared to the human benchmarks. However, the benchmark rates themselves varied considerably depending on the human data source, even within the same location, so caution should be taken when interpreting these results.
Generating a valid comparison of human and AV performance
Determining valid and comparable human crash benchmarks is key for understanding autonomous driving technology’s performance. Despite public accessibility of both human and autonomous vehicle (AV) crash data, comparing the two comes with its share of challenges. Our benchmarking paper aims to ensure a fair comparison between AV and human driving by addressing the most common errors and biases and establishing valid benchmarks from the cities in which Waymo operates.
There are two important sources of statistical bias to control for when comparing human and autonomous driving. The first is human crash underreporting. While the data on human crashes that lead to injuries or property damage is fairly robust, a large number of human low-severity crashes — like hitting some road debris or minor “fender benders” — are not reported to police. In contrast, AV companies report even the most minor crashes in order to demonstrate the trustworthiness of autonomous driving on public roads. For example, only 21% of crashes that Waymo has reported to NHTSA to date have resulted in a filed police report, regardless of the party at fault.
The second is differences in driving conditions and/or vehicle characteristics. Public human crash data includes all road types, such as freeways, where the Waymo Driver currently operates only with an autonomous specialist behind the wheel, and a wide range of vehicle types, from heavy commercial vehicles to passenger vehicles and motorcycles.
These differences mean that adjustments need to be made to human crash data before comparing it to AV crash rates. Just as you wouldn’t compare cycling in a road-race to mountain biking without accounting for differences in altitude, terrain, and other factors, comparisons between human and autonomous driving need to account for differences too.
To make sure the human crash data is valid and comparable, the injury and property damage benchmarks in our 7.1-million-mile study either included underreporting adjustments for police-reported crashes or were derived from the databases of naturalistic driving studies, which equip vehicles with sensors to record driving and are more likely to capture low-severity crashes. The methodology we used was based on best practices from a literature review of 12 past studies and 1 book comparing AV and human crash rates.
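One way to picture an underreporting adjustment is as scaling a police-reported crash rate up by the share of crashes that actually reach a police report. The sketch below is a minimal illustration of that idea only. The 2.0 observed rate and the 60% reporting share are made-up placeholders, not values from the papers, and the adjustment used in the benchmarking paper may differ.

```python
def adjust_for_underreporting(observed_rate, reported_share):
    """Estimate the underlying crash rate from a police-reported rate.

    observed_rate: incidents per million miles found in police reports
    reported_share: fraction of such crashes that get reported (0 < share <= 1)
    """
    if not 0 < reported_share <= 1:
        raise ValueError("reported_share must be in (0, 1]")
    return observed_rate / reported_share


# Hypothetical inputs, chosen only to show the direction of the adjustment.
adjusted = adjust_for_underreporting(observed_rate=2.0, reported_share=0.6)
print(f"adjusted human benchmark: {adjusted:.2f} incidents per million miles")
```

Because the adjustment raises the human benchmark to include crashes that never reach a police report, it puts the benchmark on the same footing as AV data, which includes even minor crashes.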
At Waymo, we believe that international standardization for valid analysis of AV crash data is needed. We’re looking for opportunities to engage the community of traffic safety experts and standards organizations to advance this work in the future.
Our comprehensive approach to safety
While being safer than human drivers in aggregate is an important aspect of AV safety, it is not sufficient on its own. The assessment of crash rates is just one among many methods used to evaluate the Waymo Driver’s performance, akin to how a routine health checkup involves examining various aspects of physical well-being.
Our approach goes beyond safety metrics alone. Good driving behavior matters, too — driving respectfully around other road users in reliable, predictable ways, not causing unnecessary traffic or confusion. We’re working hard to continuously improve our driving behavior across the board.
Through these studies, our goal is to provide the latest results on our safety performance to the general public, enhance transparency in the AV industry, and enable the community of researchers, regulators, and academics studying AV safety to advance the field.
You can read our two latest safety research papers and learn more about our overall approach to safety here.