Waymo Safety Impact

Making roads safer

The trust and safety of the communities where we operate is paramount to us. That’s why we’re voluntarily sharing our safety data.

The data to date indicate the Waymo Driver is already making roads safer in the places where we currently operate. Specifically, the data below demonstrate that the Waymo Driver is better than humans at avoiding crashes that result in injuries — both of any severity and specifically serious ones — as well as those that lead to airbag deployments.

This hub compares the Waymo Driver’s Rider-Only (RO) crash rates to human crash benchmarks for surface streets. It leverages best practices in safety impact analysis and builds upon dozens of Waymo’s safety publications, providing an unprecedented level of transparency within the autonomous driving industry. By sharing our data and methodologies, we also invite you to join us as we push for advancements in measuring safety impact.

The data displayed on this webpage undergo consistent updates aligned with NHTSA’s Standing General Order (SGO) reporting timelines.

How the Waymo Driver compares to humans

Rider-only (RO) miles driven

Through March 2026, Waymo has driven 220.6M rider-only miles without a human driver

The Waymo Driver has hundreds of millions of miles of real-world driving experience. This dashboard shows rider-only miles, meaning miles that Waymo has driven without a human driver, in cities where we operate our ride-hailing service, Waymo.

Location	RO Miles through March 2026
Los Angeles	51.816M
San Francisco Bay Area	67.078M
Phoenix	80.551M
Austin	15.789M
Atlanta	5.379M

Waymo Driver compared to human benchmarks

This table shows how many fewer RO crashes Waymo had (regardless of who was at fault) compared to human drivers with the average benchmark crash rate if they were to drive the same distance in the areas we operate. Results have been rounded to the nearest whole number.

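Compared to an average human driver over the same distance in our operating cities, the Waymo Driver had:

Fewer crashes overall

94% fewer serious injury or worse crashes (47 fewer)

82% fewer airbag deployment crashes (305 fewer)

82% fewer injury-causing crashes (707 fewer)

Fewer crashes with injuries to vulnerable road users

93% fewer pedestrian crashes with injuries (76 fewer)

84% fewer cyclist crashes with injuries (48 fewer)

84% fewer motorcycle crashes with injuries (32 fewer)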

Waymo Driver compared to human benchmarks

Injuries and airbag deployments

The graphs below show how many fewer incidents (crashes) per million miles (IPMM) Waymo had compared to human drivers with the benchmark crash rate. The error bars represent 95% confidence intervals for the IPMM estimate.

The reductions are shown for all locations combined and separately for individual cities.

Only data from cities with sufficient Waymo miles for statistical comparisons are shown in these charts.

Serious Injury or Worse Crash Rates

Location	Incidents per Million Miles (IPMM), Waymo	Incidents per Million Miles (IPMM), Benchmark
All Locations	0.01	0.23
Phoenix	0.01	0.12
San Francisco	0.03	0.44
Los Angeles	0.00	0.15
Austin	0.00	0.17
Atlanta Area	0.00	0.23

Any-Injury-Reported Crash Rates

Location	Incidents per Million Miles (IPMM), Waymo	Incidents per Million Miles (IPMM), Benchmark
All Locations	0.71	3.91
Phoenix	0.57	2.03
San Francisco	0.70	7.25
Los Angeles	0.91	2.42
Austin	0.70	3.35
Atlanta Area	0.93	6.60

Airbag Deployment in Any Vehicle Crash Rates

Location	Incidents per Million Miles (IPMM), Waymo	Incidents per Million Miles (IPMM), Benchmark
All Locations	0.30	1.68
Phoenix	0.32	1.38
San Francisco	0.31	2.13
Los Angeles	0.27	1.19
Austin	0.25	2.53
Atlanta Area	0.19	2.99

Airbag Deployment in Waymo Vehicle Crash Rates

Location	Incidents per Million Miles (IPMM), Waymo	Incidents per Million Miles (IPMM), Benchmark
All Locations	0.06	1.11
Phoenix	0.09	0.96
San Francisco	0.07	1.24
Los Angeles	0.00	0.96
Austin	0.06	2.13

Waymo Driver compared to human benchmarks

Percent difference in crash rate

The graphs below show the percent difference between the Waymo and human benchmark crash rates by location, with 95% confidence intervals. A negative number means the Waymo Driver reduced crashes compared to the human driver. Confidence intervals that do not cross 0% mean the percent difference is statistically significant.

The percent reductions and confidence intervals show that the Waymo Driver has a large, statistically significant, reduction in crash rates compared to the human benchmark across many outcomes and locations.

Only data from cities with sufficient Waymo miles for statistical comparisons are shown in these charts.

Waymo crash rate percent difference to benchmark

Location	Percent Difference to Benchmark, Airbag Deployment in Any Vehicle	Percent Difference to Benchmark, Airbag Deployment in Waymo Vehicle	Percent Difference to Benchmark, Any Injury Reported	Percent Difference to Benchmark, Serious Injury or Worse
All Locations	-82.23%	-94.68%	-81.93%	-94.05%
Phoenix	-76.57%	-90.96%	-71.85%	-89.50%
San Francisco	-85.28%	-94.00%	-90.34%	-93.15%
Los Angeles	-77.28%	-100.00%	-62.44%	-100.00%
Austin	-89.99%	-97.03%	-79.22%	-100.00%
Atlanta Area	-93.79%	N/A	-85.91%	-100.00%
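As a rough cross-check, the percent differences and the "fewer crashes" counts can be approximated from the rounded "All Locations" IPMM values and the total RO miles shown on this page. This is a minimal sketch: the function names are hypothetical, it assumes the combined rates apply to all 220.6M miles, and because the inputs are rounded to two decimals the outputs only approximate the published figures (the effect is largest for very small rates such as serious injury or worse).

```python
# Rounded "All Locations" values copied from the tables on this page.
RO_MILES_MILLIONS = 220.6  # rider-only miles through March 2026

# outcome -> (Waymo IPMM, benchmark IPMM)
RATES = {
    "any_injury_reported": (0.71, 3.91),
    "airbag_any_vehicle": (0.30, 1.68),
}


def percent_difference(waymo_ipmm: float, benchmark_ipmm: float) -> float:
    """Percent difference of the Waymo rate relative to the benchmark rate."""
    return (waymo_ipmm / benchmark_ipmm - 1.0) * 100.0


def fewer_crashes(waymo_ipmm: float, benchmark_ipmm: float, miles_millions: float) -> float:
    """Benchmark-expected crashes minus Waymo crashes over the same miles."""
    return (benchmark_ipmm - waymo_ipmm) * miles_millions


for outcome, (waymo, bench) in RATES.items():
    pct = percent_difference(waymo, bench)
    fewer = fewer_crashes(waymo, bench, RO_MILES_MILLIONS)
    print(f"{outcome}: {pct:.1f}% vs benchmark, about {fewer:.0f} fewer crashes")
```

The small gaps between these approximations and the published values (for example -81.93% and 707 fewer any-injury-reported crashes) come from rounding of the displayed rates.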

Percent of Waymo Driver collisions with <1mph change in velocity

Delta-V <1mph

Delta-V measures the change in velocity during a collision. It is another way to investigate crash severity and is one of the most important predictors of injury risk in vehicle-to-vehicle crashes.

This graph shows the percentage of SGO-reported crashes where the maximum Delta-V (from either the Waymo vehicle or other vehicle) was less than 1 mph—meaning the collision resulted in a <1mph change in velocity. A Delta-V less than 1 mph usually results in only minor damage (dents and scratches). This graph includes vehicle-to-vehicle and single vehicle crashes, but not crashes with pedestrians, cyclists, and motorcyclists.

Delta-V is estimated using an impulse-momentum crash model with inputs measured by the Waymo vehicle’s sensor system. Note: Comparable human benchmarks for <1mph Delta-V are currently not possible to estimate with high certainty.

% of SGO Collisions with less than 1mph change in velocity (Delta-V <1mph)

Location	% Crashes <1 mph Delta-v
ALL AREAS	41%
SF	44%
PHX	38%
LA	39%
ATL	49%
ATX	36%

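Waymo Driver compared to human benchmarks: by crash type

The graphs below show how many fewer RO crashes Waymo had (regardless of who was at fault) compared to human drivers with the average benchmark crash rate if they were to drive the same distance in the areas where Waymo operates. Crashes are divided into 11 crash type groups, and the results represent all locations combined. To view data for individual cities, see the download section. Bars labeled with a percentage indicate a statistically significant difference.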

Airbag deployment in any vehicle crashes

Crash Type Group	Events (Benchmark)	Events (Waymo)
V2V LATERAL	19	1 (-95%)
V2V INTERSECTION	206	11 (-95%)
V2V HEAD-ON	11	8
V2V F2R	48	28 (-42%)
SINGLE VEHICLE	45	0 (-100%)
SECONDARY CRASH	25	17
ALL OTHERS	8	1 (-88%)

Any-injury-reported crashes

Crash Type Group	Events (Benchmark)	Events (Waymo)
V2V LATERAL	57	11 (-81%)
V2V INTERSECTION	340	13 (-96%)
V2V F2R	139	73 (-48%)
SINGLE VEHICLE	60	2 (-97%)
SECONDARY CRASH	46	19 (-59%)
PEDESTRIAN	81	6 (-93%)
MOTORCYCLE	38	6 (-84%)
CYCLIST	56	9 (-84%)
ALL OTHERS	17	3 (-83%)

Waymo Safety Research Partners

By making detailed information about crashes and miles driven publicly accessible, Waymo’s transparency will not only support independent research but foster public trust. We hope other companies developing and deploying automated driving systems follow suit.

David Zuby, Chief Research Officer, Insurance Institute for Highway Safety (IIHS)

Waymo’s new data-sharing hub helps move AV safety science forward by enabling those outside the company to conduct independent analyses of Waymo’s performance data. Better safety benefits everyone, and enabling others to make their own assessments brings more ideas and wisdom to the AV table.

Carol A. Flannagan, Ph.D., Research Professor, University of Michigan Transportation Research Institute (UMTRI)

In our research, we have been fortunate to see up-close how strong Waymo’s performance data has been. Data accessibility is the next frontier in risk and safety research, and ensuring the accurate interpretation of this data will be critical. To support data transparency, we continue to advance the development of risk assessment methods to maintain accuracy and contribute to the safe adoption of autonomous vehicles.

Orsolya Hegedus, Head Automotive & Mobility Solutions, Swiss Re

Methodology

Comparing autonomous vehicle and human performance
Despite the public availability of crash data for both human-driven and autonomous vehicles, drawing meaningful comparisons between the two is challenging. A fair comparison has to take a number of factors into account. Here are some of the most important:
- AV and human data have different definitions of a crash. AV operators like Waymo must report any physical contact that results or allegedly results in any property damage, injury, or fatality, while most human crash data require at least enough damage for the police to file a collision report.
- Not all human crashes are reported. NHTSA estimates that 60% of property damage crashes and 32% of injury crashes aren’t reported to police ([Blincoe et al. 2023](https://crashstats.nhtsa.dot.gov/Api/Public/ViewPublication/813403)). In contrast, AV companies report even the most minor crashes in order to demonstrate the trustworthiness of autonomous driving on public roads.
- Focus should be put on injury-causing crashes. Low-speed crashes are the most frequent type of crash and usually cause only minor property damage that can be quickly repaired. Traffic safety work puts the most emphasis on reducing the higher-severity crashes that can result in injuries.
- It’s important to look at rates of events (incidents per mile) instead of absolute counts. Waymo is growing its operations in the cities where it operates, and more miles driven bring more collisions in absolute terms. Incident rates can only be calculated accurately by accounting for total miles driven; otherwise, incidents may appear to be increasing while the rate of incidents is actually going down.
- Not all streets within a city are equally challenging. Waymo’s operations have expanded over time, and, because Waymo operates as a ride-hailing service, the driving mix largely reflects user demand. The results on this data hub show human benchmarks reported in [Scanlon et al. (2024)](https://arxiv.org/abs/2312.13228) and extended in [Kusano et al. (2025)](https://www.tandfonline.com/doi/full/10.1080/15389588.2025.2499887) that are adjusted to account for differences in driving mix using a method described by [Chen et al. (2024)](https://arxiv.org/abs/2410.08903). See the “Human Benchmarks” section below for more details.
Waymo has used industry best practices to make the comparison between AV and human data sources presented on this webpage as fair as possible. The analysis is described below and, in more depth, in several of Waymo’s safety publications.
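The point about rates versus absolute counts can be illustrated with a short sketch. The yearly figures below are hypothetical, invented only to show how counts can rise while the rate per million miles falls; they are not Waymo data.

```python
# Hypothetical yearly figures, invented purely for illustration.
# Each entry is (label, incidents, miles driven).
YEARS = [
    ("year 1", 10, 5_000_000),
    ("year 2", 16, 10_000_000),
    ("year 3", 24, 20_000_000),
]


def incidents_per_million_miles(incidents: int, miles: float) -> float:
    """Incident rate normalized by exposure (IPMM)."""
    return incidents / (miles / 1_000_000)


for label, incidents, miles in YEARS:
    ipmm = incidents_per_million_miles(incidents, miles)
    print(f"{label}: {incidents} incidents over {miles:,} miles -> {ipmm:.2f} IPMM")
```

In this made-up series the incident count more than doubles while the rate drops, which is why the hub reports incidents per million miles.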

How we select Waymo incidents noted in this hub

Waymo’s data is derived from crashes reported under NHTSA’s Standing General Order (SGO) and uses the same criteria as described in Kusano et al. (2024) and Kusano et al. (2025).

We are intentionally using publicly available data to allow other researchers to replicate the results. To link the data shown on this dashboard to NHTSA’s published SGO data, researchers can download a list of SGO report IDs and boolean membership in each outcome group in the download section below. Comparisons of crash rates for the outcomes listed below and additional outcomes described in the release notes are also available for download. The “serious injury or worse” outcome uses police reports requested through public records requests. If a police report cannot be obtained by the time of publication, then such cases are marked as “unknown” status for the “serious injury or worse” category.

We compare Waymo’s crash rate to human benchmarks across several different types of crashes:

Outcome	Description	Waymo Data*	Human Benchmark
Any-injury-reported	A crash where any road user is injured as a result of the crash	Any SGO-reported crash where the field “Highest Injury Severity Alleged” is “Minor”, “Moderate”, “Serious”, or “Fatality”. Crashes with “Unknown” reported severity where the SGO narrative mentions injuries of unknown severity are also included.	Police-reported crashed vehicle rate where at least one road user had a reported injury. A 32% underreporting adjustment was applied according to Blincoe et al. (2023).
Airbag deployment in Any Vehicle	A crash where an airbag deploys in any vehicle involved in the crash	Any SGO reported crash where the “Any Air Bags Deployed?” is “Yes” for either the subject vehicle (SV) or counter party (CP). Additionally, crashes are included in this category when a review of relevant data (e.g., video) finds an airbag deployed in a third party.	Police-reported crashed vehicle rate where any vehicle involved in the crash had an airbag deployment. No underreporting adjustment was applied.
Airbag deployment in Waymo Vehicle	A crash where an airbag deploys in the Waymo vehicle involved in the crash	Any SGO reported crash where the “Any Air Bags Deployed?” is “Yes” for the subject vehicle (SV).	Police-reported crashed vehicle rate where airbag deployment occurred in the vehicle. No underreporting adjustment was applied.
Serious injury or worse	A crash where any road user is seriously injured or killed as a result of the crash	Police reports were requested through public information requests for any SGO crash with “Serious” or “Fatality” in the field “Highest Injury Severity Alleged.” The SGO crash was included if the police report indicated any person in the crash had an “incapacitating” (“A”) or “killed” (“K”) injury severity.	Police-reported crashed vehicle rate where any person in the crash had a police-reported injury of “incapacitating” (“A”) or “killed” (“K”). No underreporting adjustment was applied.

*Based on initial data submitted as part of the NHTSA Standing General Order 2021-01

Human benchmarks
The human benchmark data are the same as reported in [Scanlon et al. (2024)](https://arxiv.org/abs/2312.13228) and extended in [Kusano et al. (2025)](https://www.tandfonline.com/doi/full/10.1080/15389588.2025.2499887). These benchmarks are derived from state police-reported crash records and Vehicle Miles Traveled (VMT) data in the areas where Waymo currently operates RO services at large scale (Phoenix, San Francisco, Los Angeles, and Austin). The human benchmarks include only the crashes and VMT corresponding to passenger vehicles traveling on the types of roadways Waymo operates on (excluding freeways). The any-injury-reported benchmark also uses a 32% underreporting correction (based on [NHTSA’s Blincoe et al., 2023 study](https://crashstats.nhtsa.dot.gov/Api/Public/ViewPublication/813403)) to adjust for crashes not reported by humans. The serious injury or worse (referred to as “suspected serious injury+” in the papers) and airbag deployment human benchmark rates use the observed crashes without an underreporting correction.
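One natural reading of the 32% underreporting correction is that, if 32% of injury crashes are never reported, the observed rate represents only 68% of the true rate. The sketch below applies that reading; the function name and the observed rate are hypothetical, and the exact procedure in the cited papers may differ.

```python
def correct_for_underreporting(observed_rate: float, unreported_fraction: float) -> float:
    """Scale an observed rate up to an estimated true rate.

    Assumes `unreported_fraction` of all crashes are missing from the
    observed data, so observed = true * (1 - unreported_fraction).
    """
    if not 0.0 <= unreported_fraction < 1.0:
        raise ValueError("unreported_fraction must be in [0, 1)")
    return observed_rate / (1.0 - unreported_fraction)


OBSERVED_IPMM = 2.0  # hypothetical police-reported rate, for illustration only
adjusted = correct_for_underreporting(OBSERVED_IPMM, 0.32)
print(f"observed {OBSERVED_IPMM:.2f} IPMM -> adjusted {adjusted:.2f} IPMM")
```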

Not all streets within a city are equally challenging. If Waymo drives more frequently in more challenging parts of the city that have higher crash rates, its crash rates may be affected compared with driving in quieter areas. The benchmarks reported by [Scanlon et al.](https://arxiv.org/abs/2312.13228) are at a city level, not for specific streets or areas. The human benchmarks shown on this data hub were adjusted using a method described by [Chen et al. (2024)](https://arxiv.org/abs/2410.08903) that models the effect of spatial distribution on crash risk. The methodology adjusts the city-level benchmarks to account for the spatial distribution of Waymo’s driving. The result of the reweighting is human benchmarks that are more representative of the areas of the city where Waymo drives the most, which improves data alignment between the Waymo and human crash data. Achieving the best possible data alignment, given the limitations of the available data, is part of the newly published Retrospective Automated Vehicle Evaluation (RAVE) best practices ([Scanlon et al., 2024b](https://arxiv.org/abs/2408.07758)). The spatial dynamic benchmark approach described by Chen et al. (2024) was also used in Kusano et al. (2025).
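The idea of reweighting a city-level benchmark toward the areas where Waymo drives can be sketched as an exposure-weighted average of zone-level human crash rates. This is a simplified illustration only, not the model from Chen et al. (2024); the zone names and all numbers are hypothetical.

```python
# zone -> (human crashed vehicles, human VMT in millions, Waymo miles in millions)
# All values are hypothetical.
ZONES = {
    "downtown": (400, 100.0, 6.0),
    "midtown": (300, 150.0, 3.0),
    "suburbs": (300, 250.0, 1.0),
}


def city_level_rate(zones: dict) -> float:
    """Unadjusted benchmark: all human crashes divided by all human VMT."""
    crashes = sum(c for c, _, _ in zones.values())
    vmt = sum(v for _, v, _ in zones.values())
    return crashes / vmt


def reweighted_rate(zones: dict) -> float:
    """Zone-level human rates weighted by the share of Waymo miles in each zone."""
    waymo_total = sum(w for _, _, w in zones.values())
    return sum((w / waymo_total) * (c / v) for c, v, w in zones.values())


print(f"city-level benchmark: {city_level_rate(ZONES):.2f} IPMM")
print(f"reweighted benchmark: {reweighted_rate(ZONES):.2f} IPMM")
```

Because the hypothetical Waymo miles are concentrated in the zone with the highest human crash rate, the reweighted benchmark comes out higher than the city-level one.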
Confidence intervals and data limitations
Confidence intervals for Incidents Per Million Miles (IPMM) crash rates were computed using an exact Poisson method. The confidence intervals for the percent reduction used a Clopper-Pearson binomial method described in Nelson (1970). Both confidence intervals were assessed at a 95% confidence level. These confidence intervals use the same methods as described in [Kusano et al.](https://www.tandfonline.com/doi/full/10.1080/15389588.2024.2380786) (2023).
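For reference, the textbook forms usually associated with these two interval names are sketched below. The symbols are introduced here for illustration and are assumptions, not notation from the cited papers, which should be consulted for the exact procedure.

```latex
% x events observed over m million miles; confidence level 1 - \alpha (here \alpha = 0.05)
\hat{\lambda} = \frac{x}{m}, \qquad
\lambda_L = \frac{\chi^2_{\alpha/2,\,2x}}{2m}, \qquad
\lambda_U = \frac{\chi^2_{1-\alpha/2,\,2x+2}}{2m}
\quad (\lambda_L = 0 \text{ when } x = 0)

% Waymo: x_W events over m_W miles; benchmark: x_B events over m_B miles; n = x_W + x_B
x_W \mid n \sim \mathrm{Binomial}(n, p), \qquad
p = \frac{\lambda_W m_W}{\lambda_W m_W + \lambda_B m_B}

% Rate ratio and percent difference; interval bounds follow by substituting
% the Clopper-Pearson limits p_L and p_U for p
R = \frac{\lambda_W}{\lambda_B} = \frac{p}{1-p}\cdot\frac{m_B}{m_W}, \qquad
\text{percent difference} = 100\,(R - 1)
```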

There is no perfect “apples-to-apples” comparison between human and AV data available today. The benchmarks and comparisons on this page draw on the current state of the art in human and AV data sources and in the research in this field. The serious injury or worse and airbag deployment benchmarks do not have an underreporting correction for the human data because there is no estimate for airbag crash underreporting, although it is likely that there is more underreporting in human crash data than in AV crash data. The any-injury-reported benchmark does use an underreporting correction from [Blincoe et al.](https://crashstats.nhtsa.dot.gov/Api/Public/ViewPublication/813403) (2023) based on multiple analyses of national police-report and insurance crash data and a national phone survey. It is not straightforward to compute confidence intervals on the any-injury-reported underreporting estimate because it is derived from multiple sources. There is also evidence that underreporting may differ between localities, meaning a national estimate may not fully represent underreporting in the cities where Waymo operates.

See Scanlon et al. (2024) and Kusano et al. (2024) for a more comprehensive discussion of the limitations of these results:
1. Scanlon, J. M., Kusano, K. D., Fraade-Blanar, L. A., McMurry, T. L., Chen, Y. H., & Victor, T. (2024). Benchmarks for Retrospective Automated Driving System Crash Rate Analysis Using Police-Reported Crash Data. Traffic Injury Prevention, 25(sup1), S51-S65. https://arxiv.org/pdf/2312.13228
2. Kusano, K. D., Scanlon, J. M., Chen, Y. H., McMurry, T. L., Chen, R., Gode, T., & Victor, T. (2024). Comparison of Waymo Rider-only crash data to human benchmarks at 7.1 million miles. Traffic Injury Prevention, 25(sup1), S66-S77. https://www.tandfonline.com/doi/full/10.1080/15389588.2024.2380786

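Frequently asked questions

1. Can the research results be trusted?
- 1.1. Does the safety impact research compare Waymo and human drivers fairly, on the same basis?
  - 1.1.1. How is the safety impact research designed and conducted?
    Although a crash rate comparison comes down to four simple numbers, namely the crash counts and the miles driven for the automated driving system (ADS) and for the benchmark, many choices in study design and data source selection can affect the final result. Safety impact analysis has long been a mainstream tool in vehicle safety research, dating back to the period when safety technologies such as electronic stability control and automatic emergency braking were advancing. Because an ADS is responsible for the entire dynamic driving task, ADS safety impact studies pose unique challenges. To address them, the RAVE checklist consolidates industry consensus into best-practice guidelines. The checklist is being developed into an international standard that specifies best practices for ADS safety impact studies. The studies underlying the Safety Impact Data Hub are designed to conform to the RAVE checklist (see the online appendix of Kusano et al. (2025), which includes the study methods and an assessment of conformance to the RAVE checklist).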
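  - 1.1.2. Does Waymo report all crashes?
    Waymo’s safety impact research is based on data reported under the National Highway Traffic Safety Administration (NHTSA) Standing General Order (SGO). All ADS operators (autonomous vehicle companies like Waymo), including Waymo, must comply with the SGO and report all qualifying crashes within the specified reporting deadlines. If NHTSA finds an ADS operator’s SGO reporting to be inconsistent, it has the authority to investigate and take corrective action. The SGO reporting requirements cover minor property damage crashes, a lower reporting threshold than traditional police-report or insurance crash databases (that is, more minor crashes are included). Any crash involving an injury or an airbag deployment (the focus of the Safety Impact Data Hub) must be reported under the SGO. Given these strict reporting requirements and Waymo’s fleet operating policies, it is unlikely that the data hub is missing any crash that meets the reporting criteria. An NHTSA report (Blincoe et al., 2023) found that, among crashes of human-driven vehicles, 59.7% of property damage crashes and 31.9% of injury crashes go unreported. In contrast, Waymo’s reporting covers all known crashes detected by its sophisticated sensor system, so its records are more complete.
    Because Waymo uses police-reported data to build the benchmarks, the comparison to the benchmarks includes only crashes where the Waymo vehicle was physically contacted and was in transport (that is, not parked). Police-reported crash data do not list vehicles that were not physically contacted during a crash. Including Waymo events from SGO reports in which there was no physical contact (which may be reportable because the Waymo vehicle allegedly contributed to the crash) could therefore overstate the Waymo crash rate relative to the benchmark. Similarly, Waymo vehicles sometimes wait while parked in a legal parking space (such as a marked space or within 18 inches of the curb), with the vehicle in park and the ADS software still running. Such parked vehicles are also not counted as vehicles in police-reported data (a parked vehicle is treated as a fixed object).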
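  - 1.1.3. Do the Waymo and human driver data count the same crash outcomes?
    One of the most critical dimensions of a fair comparison between ADS and human crash data is data alignment, and an important step in aligning the data is establishing a consistent definition of a “crash”. Building on past safety evaluation research, Waymo’s safety impact research selects crash outcomes that are easy to identify in both ADS and human data sources. Police-report databases are the most common and reliable source of human crash data. However, not all human crashes are reported to police, especially minor ones. Compared with crashes that cause only minor property damage, more severe crashes that result in an airbag deployment or an injury (whether serious injury or worse, or injury of any severity) are more relevant to assessing safety.
    Although we consider the serious injury or worse, airbag deployment, and any-injury-reported outcomes more relevant to assessing safety than crashes with minor property damage, we still track and report rates for these minor collisions and compare them to benchmarks in the download section of the data hub (for example, any property damage or injury, and police-reported crashes).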
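  - 1.1.4. Does the comparison account for different driving conditions, such as weather?
    Waymo’s safety impact research uses several methods to align the human benchmark with the Waymo Driver’s driving conditions: (a) using human driving data from the counties where Waymo operates, and (b) dynamically adjusting the benchmark by location. Driving environments differ between cities, and individual roads and driving conditions carry different levels of risk. To capture local crash risk, the research uses state-maintained crash and vehicle miles traveled (VMT) data sources, restricted to the counties where Waymo currently operates. Even within a county, human crash rates vary by area. In general, more densely populated urban areas have higher crash rates than less densely populated ones. To reflect this effect, the research applies a dynamic adjustment that weights the human benchmark in proportion to Waymo’s miles driven in each area (see Kusano et al. (2025) and Chen et al. for details). Because Waymo and benchmark driving are compared in the same locations, the effects of many driving conditions are accounted for implicitly. Our research shows that crash rates vary by geography, so we do not recommend comparing Waymo’s driving against national average benchmarks.
    Aligning the benchmark crash rates with Waymo’s driving environment through local crash data and dynamic adjustment accounts for many, but not all, of the factors that may affect crash risk. For example, the cities where Waymo currently operates do not get significant snowfall, so neither the Waymo data nor the human benchmark data include this type of inclement weather. In addition, Chen et al. (2025) found that time of day is a key factor in crash rates (late-night crash rates are generally higher than daytime rates). When aligning the benchmark with Waymo data, however, limited human exposure data often makes it difficult to account for additional variables such as time of day. For example, the VMT data used to calculate the dynamic benchmark are annual averages and so cannot reflect differences by time of day. We are investigating other data sources in the hope of obtaining more human driving data to align the benchmark with Waymo data further.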
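  - 1.1.5. Why does the comparison use all human drivers in Waymo’s operating areas as the benchmark?
    The results in the Safety Impact Data Hub use best practices to align Waymo and human crash data in order to compare Waymo against the current crash performance of human drivers in the areas where it operates. This comparison answers the research question “What effect does Waymo’s driving have on the status quo?” It is the most basic question researchers ask when a new vehicle technology (such as automatic emergency braking or electronic stability control) is developed and deployed. A status quo comparison of this kind shows the potential of a vehicle technology to improve traffic safety.
    Some other Waymo studies have examined comparisons with other populations. For example, in prior research and in Waymo’s prospective safety determination methodology, we compared the Waymo Driver directly against a “non-impaired, eyes on conflict” (NIEON) driver to evaluate collision avoidance performance. Establishing a comparable crash rate benchmark poses many methodological challenges: because human drivers are not always in the NIEON state while driving, it is difficult to obtain exact VMT for such drivers and build a quantitative benchmark. Another study, a collaboration between Swiss Re and Waymo (currently under peer review), compared third-party claims rates between Waymo and humans driving latest-generation vehicles. This represents another, higher-performing subset of human-driven vehicles, because latest-generation vehicles typically have improved safety features.
    Another potentially informative comparison is to other driver groups, such as taxi or ride-hailing drivers. However, there is currently no public, independently verifiable data source that quantifies crash counts and VMT across multiple crash outcomes for these special groups in the way that general police-report data and public VMT databases do. Data on non-impaired drivers could also serve as an aspirational benchmark. That comparison has value, but it cannot assess the reduction relative to current crash rates. As with the special driver groups, it is very difficult to estimate the total crash counts and VMT of impaired drivers in a specific area. As new data sources emerge, these challenging yet valuable areas merit further research.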
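  - 1.1.6. Why does Waymo use the most severe injury in the whole crash as the basis for its statistics?
    Injury outcomes can be measured in many different ways. We do not want the analysis to focus too narrowly on occupants of the Waymo vehicle, because that could understate Waymo’s safety impact in crashes where people outside the vehicle are injured. We therefore evaluate outcomes at the crash level, recording the most severe injury sustained by any person in the crash. Using the crash-level maximum injury severity is common practice in automotive safety research, and police reports usually have a field that records it directly.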
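- 1.2. Has Waymo accumulated enough miles to draw statistically significant, reliable conclusions?
  Waymo’s mileage (hundreds of millions of miles) may seem small next to the total miles driven in the cities Waymo serves (billions of miles) or the annual total across the United States (trillions of miles). When comparing the crash rates of two populations, however, the conclusions depend on what is called statistical power. The question the Safety Impact Data Hub answers is: are the Waymo and benchmark crash rates different? The inputs to this calculation are the crash counts and miles driven for the Waymo and benchmark populations, modeled with a Poisson distribution, the most commonly used distribution for count data.
  An example of this kind of problem is counting how many students fail a test. Suppose 1,000 students in a school district take the same test and 300 fail (3 of every 10 test-takers). We can ask whether Class A, with 20 students, performed differently from the whole population (note: for simplicity, we assume that passing the test is independent of whether a student is in Class A). Suppose 10 of the 20 students in Class A fail (5 of every 10 test-takers), a failure rate about 1.7 times that of the district. Using Poisson confidence intervals at the 95% confidence level, however, the failure rate of the 20-student class is not statistically different from the district average. If we instead compare Class A with all 100,000 students statewide (3 of every 10 fail, that is, 30,000 of 100,000), the 95% confidence interval is nearly the same as for the comparison of Class A with the district (300 of 1,000 fail). This shows that, in this comparison, Class A, with few observations (only 20 students), carries far more uncertainty than the larger population. Another class, Class B, has 20 students, of whom only 1 fails (0.5 of every 10 test-takers). Applying the 95% confidence interval, Class B’s failure rate is statistically different from the district average (and from the state average). The example shows that, when comparing event rates between two groups where one is much larger than the other (measured by test-takers or by miles driven), two factors drive statistical significance: (a) the number of observations in the smaller group (the more observations, the sooner significance is reached) and (b) the size of the difference in rates (the larger the difference, the sooner significance is reached).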
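The test example above can be reproduced with an exact (Garwood) Poisson interval. The sketch below uses only the Python standard library and finds the interval limits by bisection; the function names are hypothetical, and it simplifies the comparison by treating the district failure rate as known exactly rather than estimating its own uncertainty.

```python
import math


def poisson_cdf(k: int, mu: float) -> float:
    """P(X <= k) for X ~ Poisson(mu)."""
    if k < 0:
        return 0.0
    if mu == 0.0:
        return 1.0
    terms = (math.exp(i * math.log(mu) - mu - math.lgamma(i + 1)) for i in range(k + 1))
    return min(1.0, sum(terms))


def bisect_root(f, lo: float, hi: float, iters: int = 200) -> float:
    """Root of a monotone function f on [lo, hi], assuming f changes sign."""
    sign_lo = f(lo) > 0
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if (f(mid) > 0) == sign_lo:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def poisson_exact_ci(count: int, alpha: float = 0.05) -> tuple:
    """Exact two-sided interval for the Poisson mean given an observed count."""
    hi = 2.0 * count + 20.0  # generous upper search bound
    lower = 0.0
    if count > 0:
        lower = bisect_root(lambda mu: (1.0 - poisson_cdf(count - 1, mu)) - alpha / 2, 0.0, hi)
    upper = bisect_root(lambda mu: poisson_cdf(count, mu) - alpha / 2, 0.0, hi)
    return lower, upper


DISTRICT_RATE = 300 / 1000  # district failure rate, treated as known

for label, fails, students in [("Class A", 10, 20), ("Class B", 1, 20)]:
    low, high = poisson_exact_ci(fails)
    rate_low, rate_high = low / students, high / students
    differs = not (rate_low <= DISTRICT_RATE <= rate_high)
    print(f"{label}: 95% CI for failure rate [{rate_low:.3f}, {rate_high:.3f}], "
          f"differs from district: {differs}")
```

Under these assumptions the Class A interval contains the district rate of 0.3 while the Class B interval does not, matching the conclusion in the text.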
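  Now consider another experiment using Waymo data. In the accompanying figure, Waymo’s number of airbag-deployment-in-any-vehicle crashes (34) and VMT (71.1 million miles) are held fixed, while the miles driven by the human benchmark population are assumed to take different orders of magnitude (the benchmark rate is 1.649 incidents per million miles, over 17.8 billion miles). The point estimate shows that Waymo’s crash rate is 71% lower than the benchmark. The confidence interval (sometimes called error bars) reflects the uncertainty in this reduction at the 95% confidence level (the standard for most statistical tests). If the error bars do not cross 0%, then statistically we are 95% confident that the result is not due to chance, which is also called statistical significance. This “simulation” shows how varying the benchmark population’s VMT affects statistical significance. Even when the benchmark population’s mileage (assumed to be 10 million miles) is smaller than Waymo’s, the comparison is still statistically significant. Moreover, once the human benchmark exceeds 100 million miles, the confidence intervals of the comparison hardly change. This means that, statistically, a comparison against a large U.S. city (billions of miles) is no different from a comparison against annual U.S. driving (trillions of miles). As in the school test example, Waymo has accumulated enough miles (tens to hundreds of millions) and the crash rate reductions are large enough (70% to 90%) to reach statistical significance.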
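The experiment described above can be approximated with a conditional binomial (Clopper-Pearson) interval on the rate ratio. This is a minimal standard-library sketch under stated assumptions: the function names are hypothetical, benchmark event counts are obtained by rounding the benchmark rate times the assumed miles, and the exact procedure in the cited papers may differ.

```python
import math

WAYMO_EVENTS = 34        # airbag-deployment-in-any-vehicle crashes (from the text)
WAYMO_MILES_M = 71.1     # Waymo VMT in millions of miles (from the text)
BENCHMARK_IPMM = 1.649   # benchmark incidents per million miles (from the text)


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p), summed in log space so large n is safe."""
    if k < 0:
        return 0.0
    if k >= n:
        return 1.0
    log_p, log_q = math.log(p), math.log1p(-p)
    log_n_fact = math.lgamma(n + 1)
    total = 0.0
    for i in range(k + 1):
        log_term = (log_n_fact - math.lgamma(i + 1) - math.lgamma(n - i + 1)
                    + i * log_p + (n - i) * log_q)
        total += math.exp(log_term)
    return min(1.0, total)


def solve_increasing(f, iters: int = 200) -> float:
    """Bisection root of an increasing function f on the open interval (0, 1)."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if f(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def clopper_pearson(x: int, n: int, alpha: float = 0.05) -> tuple:
    """Exact two-sided interval for a binomial proportion."""
    lower = 0.0 if x == 0 else solve_increasing(
        lambda p: (1.0 - binom_cdf(x - 1, n, p)) - alpha / 2.0)
    upper = 1.0 if x == n else solve_increasing(
        lambda p: alpha / 2.0 - binom_cdf(x, n, p))
    return lower, upper


def percent_difference_ci(x_w: int, miles_w: float, x_b: int, miles_b: float,
                          alpha: float = 0.05) -> tuple:
    """(lower, point, upper) percent difference of the Waymo rate vs the benchmark."""
    p_lo, p_hi = clopper_pearson(x_w, x_w + x_b, alpha)

    def to_pct(p: float) -> float:
        # Convert the binomial share p into a rate ratio, then a percent difference.
        return (p / (1.0 - p) * miles_b / miles_w - 1.0) * 100.0

    point = ((x_w / miles_w) / (x_b / miles_b) - 1.0) * 100.0
    return to_pct(p_lo), point, to_pct(p_hi)


# Benchmark miles (in millions) at different orders of magnitude.
for bench_miles_m in (10.0, 100.0, 1_000.0, 17_800.0, 1_000_000.0):
    bench_events = round(BENCHMARK_IPMM * bench_miles_m)
    low, point, high = percent_difference_ci(
        WAYMO_EVENTS, WAYMO_MILES_M, bench_events, bench_miles_m)
    print(f"benchmark {bench_miles_m:>11,.0f}M miles: "
          f"{point:6.1f}% (95% CI {low:6.1f}% to {high:6.1f}%)")
```

With the benchmark count rounded to a whole number, the point estimate at small benchmark mileages shifts slightly from 71%, which is an artifact of the rounding rather than of the method.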
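- 1.3. Have Waymo’s methods been peer reviewed or externally validated?
  This analysis uses the methods and human benchmarks described in Scanlon et al. (2024), Kusano et al. (2024), and Kusano et al. (2025).
  These research papers were published in peer-reviewed scientific journals.
  References
  Scanlon, J. M., Kusano, K. D., Fraade-Blanar, L. A., McMurry, T. L., Chen, Y. H., & Victor, T. (2024). Benchmarks for Retrospective Automated Driving System Crash Rate Analysis Using Police-Reported Crash Data. Traffic Injury Prevention, 25(sup1), S51-S65.
  Kusano, K. D., Scanlon, J. M., Chen, Y. H., McMurry, T. L., Chen, R., Gode, T., & Victor, T. (2024). Comparison of Waymo Rider-only crash data to human benchmarks at 7.1 million miles. Traffic Injury Prevention, 25(sup1), S66-S77.
  Kusano, K. D., Scanlon, J. M., Chen, Y. H., McMurry, T. L., Gode, T., & Victor, T. (2025). Comparison of Waymo Rider-Only Crash Rates by Crash Type to Human Benchmarks at 56.7 Million Miles. Traffic Injury Prevention, 26(sup1), S8-S20. https://doi.org/10.1080/15389588.2025.2499887
  In peer review, a research paper submitted to a journal is reviewed anonymously by researchers who are experts in the field, and who suggest improvements. Peer review has long been the gold standard for research publication. The process requires that the research be described in enough detail to reproduce the results and that the conclusions be supported by the results. The methods used in the Safety Impact Data Hub are consistent with the peer-reviewed papers and therefore offer a degree of transparency. As is customary in academic publishing, we usually release a preprint while an article is under peer review, in order to share the findings and invite feedback from the scientific community.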
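- 1.4. Can researchers access the raw data?
  Yes. The results on the data hub can be reproduced from public data. As noted in question 1.1.2, all Waymo crash counts are based on events reported under the NHTSA Standing General Order (SGO). We also provide CSV files of the underlying data for every statistic on the data hub so that researchers and other third parties can reproduce and verify the results. These data include miles driven by location (CSV1), the SGO case identifier and crash outcome categories for each case in the analysis (CSV2), comparisons to benchmark crash rates aggregated by location, crash outcome, and crash type (CSV3), and miles driven by sub-area within each city, used for the dynamic spatial adjustment (CSV4). The methods used on the data hub all come from publicly available peer-reviewed papers (see question 1.3 for citations).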
- 1.5. 為什麼結果是以每英里事故車輛數呈現？
  車輛事故率或車輛層級的事故率，是將特定結果的事故車輛數量，除以母體層級的 VMT 計算而得。Waymo 車輛事故率的計算方式：特定結果事故所涉及的 Waymo 車輛數，除以 Waymo 無人駕駛模式 (RO) 總行駛里程。基準值的計算方式：警方通報資料中，特定結果事故所涉及的車輛總數，除以總母體 VMT。
  另一項可供參考的指標則是事故層級事故率，即每單位母體 VMT 的事故數量。不過，若使用事故層級基準來比較自動駕駛系統 (ADS) 車隊的車輛層級事故率，會因為單位不符而導致錯誤結論，以下舉一個簡單的例子來說明。假設基準母體有兩輛車，各行駛 100 英里後相撞 (即 2 輛事故車、1 起事故、母體 VMT 為 200)。事故層級的事故率為每 100 英里 0.5 起 (1 起事故除以 200 英里)，車輛層級的事故率則為每 100 英里 1 輛 (2 輛事故車除以 200 英里)。從警方通報的事故資料 (平均每起事故涉及 1.8 輛車) 和 VMT (基於所有車輛的估算值) 推導基準時，也要注意類似的單位不符情形。現在假設第二個 ADS 母體有 1 輛車，這輛車同樣在行駛 100 英里後發生事故，且事故對象為母體外的車輛。這種情況也反映了 ADS 車隊資料的實際收集方式，系統會記錄 ADS 車隊的總 VMT，以及涉及 ADS 車輛的事故。就 ADS 車隊而言，車輛事故率 (車輛層級) 為每 100 英里 1 輛事故車。如果誤將每 100 英里 0.5 起事故的事故層級基準率，與每 100 英里 1 輛事故車的 ADS 車輛層級事故率進行比較，結論會是 ADS 車隊的事故率是基準的 2 倍。事實上，在這個例子中，ADS 的事故率（每 100 英里發生 1 起事故）與基準事故率（即車輛駕駛員每行駛 100 英里發生 1 起事故）並無不同。
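  It is easy to make the mistake of comparing a crash-level rate with a vehicle-level rate when working from summary statistics, because the summary statistics published by research institutions usually list the number of crashes rather than the number of vehicles involved in crashes. For example, Scanlon et al. (2024) reported that in 2022 there were 5,930,496 police-reported crashes in the United States, involving 10,528,849 vehicles. Total national VMT in 2022 was 3.2 trillion miles. In other words, the U.S. crash-level rate was 1.9 crashes per million miles, while the vehicle-level rate was 3.3 crashed vehicles per million miles.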
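As a quick check of the national figures quoted above, the sketch below recomputes both rates from the stated counts and the rounded 3.2 trillion mile VMT. The variable names are illustrative assumptions.

```python
# 2022 U.S. figures as quoted in the text (Scanlon et al. 2024).
crashes = 5_930_496
crashed_vehicles = 10_528_849
vmt_miles = 3.2e12  # rounded national VMT, in miles

# Rates per million miles (IPMM).
crash_rate_ipmm = crashes / vmt_miles * 1e6
vehicle_rate_ipmm = crashed_vehicles / vmt_miles * 1e6

# Average number of vehicles involved in each police-reported crash.
vehicles_per_crash = crashed_vehicles / crashes

print(f"crash-level rate:   {crash_rate_ipmm:.1f} per million miles")
print(f"vehicle-level rate: {vehicle_rate_ipmm:.1f} per million miles")
print(f"vehicles per crash: {vehicles_per_crash:.1f}")
```

The vehicle-level rate is simply the crash-level rate multiplied by the average number of vehicles per crash, which is why the two differ by a factor of about 1.8.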
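  Another common metric in traffic safety is the number of injured persons per unit of VMT (a person-level rate). Person-level rates are valuable for measuring the crash burden on an entire population. However, they pose practical and interpretive problems when comparing different populations, as the Safety Impact Data Hub does, so they are not ideal for this purpose. Even if crash involvement rates stay constant, the person-level rate of an ADS fleet operating in mixed traffic will trend downward as the fleet size (or penetration) grows. Because crashes often involve multiple vehicles, a larger fleet makes it more likely that several ADS vehicles are involved in a single crash, which lowers the person-level rate (the number of people involved is unchanged, but VMT increases). This means that early in testing, the person-level rate will appear higher than the benchmark even if the ADS fleet is involved in about as many crashes as the benchmark population. One way to address this bias is to compute a prorated person-level rate: divide the total number of persons involved in crashes of a given outcome by the number of crashed vehicles, then divide the result by VMT. Although the prorated rate resolves the multi-vehicle bias, it introduces a different interpretive bias: crashes involving fewer vehicles receive more weight than crashes involving more vehicles. There is also a practical limitation: the NHTSA Standing General Order (SGO), the most complete source of ADS crash data, reports only the maximum injury severity in a crash, not the number of injured persons at each severity level. Person-level rates therefore cannot currently be computed from SGO data. Some state crash databases have a similar limitation and report only the maximum injury severity. Given the potential interpretive bias and the reporting limitations, vehicle-level rates are preferable to person-level rates for comparing ADS and benchmark crash rates.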
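The fleet-size effect on person-level rates can be illustrated with a toy case. The numbers below are hypothetical and chosen only to show the mechanism: one two-vehicle crash injures two people, and each vehicle has driven 100 miles. The sketch compares a fleet that owns one of the two vehicles with a fleet that owns both.

```python
def per_100_miles(count: float, vmt_miles: float) -> float:
    """Events per 100 miles of fleet travel."""
    return count * 100 / vmt_miles


injured_persons = 2      # hypothetical: people injured in the single crash
miles_per_vehicle = 100  # hypothetical: miles driven by each vehicle

# Case A: only one of the two crashed vehicles belongs to the fleet.
small_fleet_vmt = 1 * miles_per_vehicle
small_person_rate = per_100_miles(injured_persons, small_fleet_vmt)
small_vehicle_rate = per_100_miles(1, small_fleet_vmt)

# Case B: both crashed vehicles belong to the fleet (same crash, same injuries).
large_fleet_vmt = 2 * miles_per_vehicle
large_person_rate = per_100_miles(injured_persons, large_fleet_vmt)
large_vehicle_rate = per_100_miles(2, large_fleet_vmt)

print("person-level :", small_person_rate, "->", large_person_rate)
print("vehicle-level:", small_vehicle_rate, "->", large_vehicle_rate)
```

In this toy case the person-level rate halves as the fleet grows, while the vehicle-level rate is unchanged, which is the behavior the paragraph describes.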
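- 1.6. What is the difference between crashes per mile and miles between crashes?
  Mathematically, crashes per mile and miles between crashes are reciprocals: to convert crashes per mile to miles between crashes, divide 1 by the crashes-per-mile value (and vice versa). However, the RAVE checklist recommends expressing crash rates as crashes per mile. An important reason is that crashes per mile is linear in the number of events, whereas its reciprocal, miles between crashes, is non-linear. This non-linearity makes changes in rates harder to compare. Similar difficulties arise with other measures, such as vehicle fuel economy (miles per gallon versus gallons per 100 miles).
  As the RAVE checklist states: "Consider an ADS that has 1 million miles per incident compared to a benchmark of 750,000 miles per incident. Another ADS has 500,000 miles per incident compared to a benchmark of 250,000 miles per incident. In both cases the difference is 250,000 miles per incident, which makes the performance seem similar. In fact, the opposite is true: in the first comparison the ADS reduces incidents per mile by 25% (1 IPMM vs 1.33 IPMM), while in the second the ADS reduces them by 50% (2 IPMM vs 4 IPMM). Because rates of incidents per unit of exposure are linearly proportional to the total number of incidents, while rates of exposure per incident are non-linearly related, the relative differences between the latter are difficult to compare, and this is often not obvious."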
  Figure 2 from RAVE checklist
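The RAVE example can be verified directly. The sketch below converts miles per incident to incidents per million miles (IPMM) and computes the percent reduction for both comparisons; the function names are assumptions made for illustration.

```python
def incidents_per_million_miles(miles_per_incident: float) -> float:
    """Convert miles per incident to incidents per million miles (IPMM)."""
    return 1_000_000 / miles_per_incident


def percent_reduction(ads_miles_per_incident: float,
                      benchmark_miles_per_incident: float) -> float:
    """Percent reduction in incidents per mile of the ADS relative to the benchmark."""
    ads = incidents_per_million_miles(ads_miles_per_incident)
    benchmark = incidents_per_million_miles(benchmark_miles_per_incident)
    return (1 - ads / benchmark) * 100


# Both comparisons differ by the same 250,000 miles per incident...
first_gap = 1_000_000 - 750_000
second_gap = 500_000 - 250_000

# ...but the reductions in incidents per mile are very different.
first_reduction = percent_reduction(1_000_000, 750_000)
second_reduction = percent_reduction(500_000, 250_000)

print(f"gaps: {first_gap} vs {second_gap} miles per incident")
print(f"reductions: {first_reduction:.0f}% vs {second_reduction:.0f}%")
```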
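2. What do these findings mean?
- 2.1. What can we conclude from the safety impact results?
  - 2.1.1. Does this data mean the Waymo Driver is safer than human drivers?
    The research shows that, measured by the rate of crashes of a given outcome per unit of distance driven, the Waymo Driver performs more safely than the overall human driving population in the same areas where Waymo operates. The research is designed to compare the safety performance of the Waymo Driver with that of all human-driven vehicles in the same geographic area. The human crash rate can be regarded as the "status quo" for that area. Making this comparison in a safety impact analysis shows how introducing Waymo's technology performs relative to the status quo.
    The overall human crash rate is a very widely used metric because data reporting practices in the United States and most of the world are nearly universal and long established. There is abundant research precedent on year-over-year trends and systemic challenges facing the driving population of an entire geographic area. Although crash data offers detailed insight into many subsets (for example, vehicle type or factors such as impaired driving), the corresponding VMT data usually lacks the same resolution and cannot be broken down further. For example, comparing crash rates for a factor such as impaired driving requires knowing, or estimating, the VMT of impaired drivers. At present, far too few resources are devoted to isolating specific driver groups and comprehensively analyzing their crash risk.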
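  - 2.1.2. Do the safety impact results mean that Waymo is "safe enough"?
    Although statistically significant reductions in crash rates demonstrate a safety benefit (that is, fewer crashes), Waymo makes its "safe enough" claim through its safety framework and safety case before an ADS configuration is released. The goal of the safety impact analysis is not to define whether the safety level of an automated driving system is reasonable. Waymo uses its safety framework to determine safety readiness according to the approval guidelines for a specific software release candidate. The adequacy of these processes is, in turn, analyzed independently through the safety case. A safety case is a formal framework that explains how an ADS developer determines that a system is safe enough to be deployed on public roads without a human driver. It includes the evidence used to formally determine the absence of unreasonable risk, a description of the system, the methods and metrics used to validate it, and the actual results of validation tests. The Safety Impact Data Hub, by contrast, provides retrospective evidence that validates the safety framework and safety case after deployment. This cycle of continuously building confidence also assures Waymo that, as the company expands into new areas, its safety framework and safety case processes will deliver consistent safety impact results.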
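  - 2.1.3. How does performance differ across Waymo's multiple software and hardware versions?
    Most of the safety impact research uses crash data from all rider-only (RO) miles accumulated to date. Waymo's mileage has grown substantially over time, so recent data makes up a larger share of Waymo's miles than early data. As explained in the answer to "Why isn't the comparison of Waymo rider-only to benchmark crash rates broken down into more categories?", further subdividing the miles reduces the statistical power of the analysis, a limitation common to other safety-critical fields.
    Safety impact research on traditional safety systems often pools data across multiple software versions, and even across manufacturers. For example, the Insurance Institute for Highway Safety and the PARTS consortium have published many studies on technologies such as automatic emergency braking and lane departure assistance that pool data from several manufacturers to determine the overall effect of the technology. Likewise, the Waymo safety impact research presents the overall effect of the Waymo Driver. As mileage grows, we may be able to study Waymo's safety impact over shorter time periods.
    The Waymo safety impact research is designed to answer the research question: "What is the safety impact of Waymo compared with the crash rates of today's human-driven vehicles (the status quo)?" Another question worth asking is: "How is Waymo confident that newly released software and hardware versions are safe?" To answer this second question, Waymo developed its safety framework and safety case methodology. In short, these processes evaluate each new Waymo candidate configuration against acceptance criteria using a range of methods that span vehicle architecture, driving behavior, and operations.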
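  - 2.1.4. What is Waymo's impact on fatal crashes?
    Waymo has published a number of benchmarks that it intends to use in future evaluations, including fatal crash involvement rates. We do not currently report "fatal only" as a separate category because Waymo's VMT in its operating areas is not yet sufficient to reach statistical significance. The Waymo Driver is also inherently designed to mitigate or eliminate the leading contributors to fatal crashes. According to the latest NHTSA data, these include speeding, impaired driving, distracted driving, and unbelted occupants. The "serious injury or worse" category includes both serious injury and fatal crashes. In addition, all other crash outcome categories also count fatal crashes.
    Waymo's approach is to (a) proactively publish benchmarks, methodologies, and the intended analytical lens; (b) evaluate against these established benchmarks once a previously completed statistical power analysis indicates that significance may be achievable; and (c) publish the findings on our data hub and in scientific journals.
    As with many safety innovations in the history of automotive safety, there are other ways to assess a technology's potential before it is widely deployed and has accumulated miles. For example, in one study we reconstructed fatal crashes involving human drivers in Chandler, Arizona. The Waymo Driver avoided 100% of the simulated fatal crashes when it was the initiator, and avoided 82% of the collisions even as the responder. Studies like this, together with Waymo's safety readiness determination process, demonstrate the Waymo Driver's strong potential to reduce serious injuries and fatalities.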
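  - 2.1.5. What is the difference between "serious injury or worse" and "fatal" crash rates?
    In automotive safety, injuries at or above a given severity level are a common subject of study. In this analysis, "serious injury or worse" includes suspected serious injuries (level "A", or incapacitating injury, on the KABCO scale used in U.S. police reports) and fatal injuries (level "K" on the KABCO scale). Waymo has published benchmarks that treat "K" crashes (fatal) as a separate category. The Safety Impact Data Hub does not yet include this crash outcome, but we plan to add it in the future.
    Looking at "serious injury" alone ("A"-level injuries only) can introduce a form of exclusion bias. For example, if fatal injuries were not counted and an intervention led only to fatal outcomes and rarely to suspected serious injury outcomes, the conclusion could be badly wrong and the intervention would appear far safer than it is. Adding the "or worse" condition avoids this potential fallacy.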
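- 2.2. Are there other factors not considered in the safety impact research that affect how the results should be interpreted?
  - 2.2.1. If it occasionally needs remote human assistance, can Waymo really be called autonomous?
    The Waymo safety impact research examines the difference in crash rates between Waymo vehicles and human-driven vehicles operating in comparable areas. In certain difficult or rare situations, the Waymo Driver can contact a human agent through remote assistance to obtain additional information that helps it interpret its surroundings. Remote assistance has been part of the Waymo Driver's design from the start and is indeed one of the reasons Waymo can scale its operations safely. Waymo's remote assistance program has passed an independent third-party audit showing that it meets industry best practices in the field, and it meets the definition of automated driving under industry standards (1, 2).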
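  - 2.2.2. Can autonomous vehicles handle all the demanding situations a human driver faces over a lifetime of driving?
    The Waymo Driver currently drives millions of miles each week. Combining on-road and simulated miles, it has the equivalent of hundreds of human lifetimes of driving experience. At this scale, many challenging scenarios (such as a pedestrian suddenly emerging from behind a parked vehicle, or a vehicle running a red light) have become routine. If the Waymo Driver could not handle many of the challenging situations a human encounters in a lifetime, Waymo's crash rates would not be so much lower than those of human drivers.
    Waymo uses its safety framework to determine safety readiness according to the approval guidelines for a specific software release candidate. The adequacy of these processes is, in turn, analyzed independently through the safety case. A safety case is a formal framework that explains how an ADS developer determines that a system is safe enough to be deployed on public roads without a human driver. It includes the evidence used to formally determine the absence of unreasonable risk, a description of the system, the methods and metrics used to validate it, and the actual results of validation tests.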
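  - 2.2.3. When a Waymo vehicle drives between trips (also called deadheading), no one is inside the vehicle to be injured. Does this also count as a safety benefit?
    The Waymo safety impact results show that Waymo causes fewer injury crashes per unit of distance driven than the status quo of human-driven vehicles. Part of this benefit arises because Waymo vehicles are sometimes empty (for example, when traveling to or from a depot to charge, or between trips). Note that the metrics used in the Waymo safety impact research count anyone injured in a crash, whether or not they were inside the Waymo vehicle. This includes vulnerable road users such as pedestrians and cyclists, as well as occupants of other vehicles in the crash. So although driving empty gives Waymo some statistical advantage, that alone cannot explain Waymo's large reduction in injury crashes (even a vehicle that always drives empty can still be in a crash that injures someone outside it). In addition, crash outcomes other than injuries (such as the airbag deployment metric) are unaffected by whether a Waymo vehicle is occupied. A Waymo vehicle's airbags deploy normally whether or not anyone is inside. The reduction in airbag deployments relative to the benchmark is similar to the reduction in injury crashes, which further confirms that the observed benefit does not depend heavily on the occupancy of Waymo vehicles.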
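  - 2.2.4. Which scenarios does the evaluation of the Waymo Driver's performance cover?
    Aligning Waymo and human crash and driving data is one of the most important factors in a fair, like-for-like crash rate comparison (see question 1.1 for details on alignment).
    We initially focused on three key elements that are informative for safety evaluation.
    Crash severity: multiple levels, from police-reported at the lowest to fatal at the highest.
    Crash type (shown below): the taxonomy we chose is based on prior NHTSA research that identified the most challenging driving scenarios.
    Road type: we distinguish crash rates on surface streets from those on freeways. Waymo currently has limited freeway mileage, so our research focuses only on surface streets. Once VMT is sufficient for statistical comparison, we plan to report the two road types separately in future reports.
    We are actively developing additional dimensions for evaluating the Waymo Driver's performance, but in practice we remain limited by the human crash data currently available. Waymo relies heavily on public crash and mileage data, yet these data offer limited information and cannot fully capture the specifics of each human-driven crash. Waymo's own data, by contrast, is extremely rich: we continuously monitor VMT and record every crash in full through multiple sensors. To broaden the analysis, we will keep exploring new, more detailed data sources and actively seek support from the research community to obtain more analyses and data that can support this research.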
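  - 2.2.5. Does the safety impact analysis include the risk of an autonomous vehicle stopping unexpectedly?
    The safety impact analysis covers all crashes involving Waymo vehicles operating in rider-only (RO) mode. The risk of a Waymo vehicle stopping in the roadway and being rear-ended is therefore already included in the analysis. The human benchmark also includes this kind of crash involving a stationary vehicle.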
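  - 2.2.6. Why isn't fault information for these collisions published?
    This analysis covers all collisions, regardless of which party was at fault and regardless of Waymo's responsibility. Moreover, fault for causing or contributing to a collision is a legal determination. That said, a recent peer-reviewed study led by Swiss Re showed that over more than 3.8 million miles, the Waymo Driver reduced the frequency of property damage insurance claims by 76% and reduced the frequency of bodily injury claims to zero, compared with human drivers.
    Citation
    Di Lillo, L., Gode, T., Zhou, X., Atzei, M., Chen, R., & Victor, T. (2024). Comparative safety performance of autonomous-and human drivers: A real-world case study of the Waymo Driver. Heliyon, 10(14). https://doi.org/10.1016/j.heliyon.2024.e34379
    A follow-up study using insurance claims data (currently under peer review) found that, over more than 25 million miles, the Waymo RO service likewise achieved large reductions compared with human drivers. In addition to the overall human driver benchmark, this newer study introduced a "latest-generation" vehicle benchmark. Compared with the overall vehicle population, the latest-generation vehicles (model years 2018 to 2021) had lower property damage and bodily injury claim rates. Waymo reduced property damage claim rates by 88% and bodily injury claim rates by 92% compared with the overall vehicle population, and by 86% and 90%, respectively, compared with latest-generation vehicles. All of these differences are statistically significant.
    Citation
    Di Lillo, L., Gode, T., Zhou, X., Chen, R., & Victor, T. (2024). Do Autonomous Vehicles Outperform Latest-Generation Human-Driven Vehicles? A Comparison to Waymo’s Auto Liability Insurance Claims at 25 Million Miles.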
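  - 2.2.7. How would a possible increase in VMT caused by autonomous vehicles affect overall net safety?
    Today, Waymo's level of service is already comparable to that of human-driven ride-hailing services. The data shows that Waymo reduces the likelihood of serious injury or worse, airbag deployment, and any-injury-reported crashes by more than 80%. For the introduction of Waymo to cause a net increase in the total number of crashes, total VMT would have to grow by more than 80%, and growth of that size is clearly unrealistic. In fact, many studies show that introducing shared autonomous vehicles can substantially reduce total VMT and the number of vehicles on the road (e.g., 1, 2, 3, 4, 5).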
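- 2.3. What role does Waymo play in improving overall road safety?
  - 2.3.1. Why not wait until the technology is fully mature before deploying autonomous vehicles more widely?
    Traffic safety is a public health issue. The 2030 Agenda for Sustainable Development set an ambitious target of reducing global road traffic deaths and injuries by 50% by 2030. A RAND Corporation study modeled the deployment of automated driving systems (ADS) under many assumptions, including deploying a system whose crash rate is only slightly lower than that of today's human drivers, or waiting many years to deploy a system whose crash rate is far lower. The results showed that the earlier the deployment, the more harm is avoided.
    Waymo developed its safety framework and safety case methodology with the top-level goal of deploying a rider-only (RO) system with an absence of unreasonable risk (AUR). To meet this safety case goal, we decompose the system's potential hazards into multiple dimensions, set acceptance criteria, and evaluate each claim and its evidence before deployment to ensure that the Waymo Driver is acceptably safe.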
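  - 2.3.2. Autonomous vehicles may be safer than human drivers, but shouldn't we focus on other existing solutions?
    Solving the traffic safety crisis does not depend on any single technology or policy program. Autonomous vehicles such as the Waymo Driver are one of many tools for improving traffic safety. Waymo embraces the Safe System approach and Vision Zero, which pursue improvements on several fronts: safer roads, safer speeds, safer vehicles, safer road users, and better post-crash care. Likewise, many safety improvements (such as investing in safer roads, setting safe speed limits, strengthening traffic enforcement, increasing seat belt use, and reducing impaired driving) also make riding with Waymo safer. Like most companies in the industry, Waymo is a private enterprise, so as a society we can support the adoption of autonomous vehicles without taking away from other safety improvements.
    Compared with other safety technologies, autonomous vehicles have a distinct advantage: the size of their safety benefit relative to human driving is much larger. For example, automatic emergency braking reduces rear-end crashes (which make up only about a quarter of all crashes) by about 50%. By comparison, the Waymo Driver reduces any-injury-reported crashes by about 80% across all crash types, including intersection and vulnerable road user (VRU) crashes that existing active safety technologies have not yet been able to reduce effectively.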
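A rough back-of-the-envelope reading of the figures above is sketched below. It uses only the approximate values quoted in the text and is not a figure reported on the data hub; note that the two reductions refer to different crash outcomes (all crashes versus any-injury-reported crashes), so the comparison is indicative only.

```python
# Approximate values quoted in the text.
rear_end_share_of_crashes = 0.25    # rear-end crashes: about a quarter of all crashes
aeb_rear_end_reduction = 0.50       # AEB reduces rear-end crashes by about 50%
waymo_injury_crash_reduction = 0.80 # Waymo: about 80% fewer any-injury-reported crashes

# If AEB affects only rear-end crashes, its effect on all crashes is the product.
aeb_overall_reduction = rear_end_share_of_crashes * aeb_rear_end_reduction

print(f"AEB, all crashes (rough):       {aeb_overall_reduction:.1%}")
print(f"Waymo, injury-reported crashes: {waymo_injury_crash_reduction:.0%}")
```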
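  - 2.3.3. How do autonomous vehicles fit into the "Safe System" framework?
    The Safe System framework grew out of the global Vision Zero movement and aims to eliminate deaths and serious injuries in the road transport system through a systematic approach. Waymo's autonomous vehicles are designed according to Vision Zero principles and are an important part of a Safe System strategy. In addition to requiring all occupants to wear seat belts, the Waymo system is designed to strictly obey speed limits and uses vehicles equipped with the latest passive safety features.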
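- 2.4. What does the shared data include?
  - 2.4.1. How often is the data updated?
    In this analysis we use public data, namely the crash reports Waymo submits under the NHTSA Standing General Order (SGO), so that other researchers can reproduce the results. The data shown on this page is updated on an ongoing basis in line with the NHTSA SGO reporting schedule.
    In addition to releasing new data, we may update the methods used to compare the Waymo RO (rider-only) service with human benchmarks. Best practices for retrospective safety impact analysis are an evolving science. If the analysis methods change, we will publish the changes and explain their effect on the results and on how the data should be interpreted. For details, see the release notes document in the download section.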
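  - 2.4.2. Why does the Waymo SGO download data include the crash date, location, and zip code?
    This information is essential for analyzing and understanding collisions but is not included in the NHTSA SGO data. The NHTSA SGO reporting form removed the zip code field starting in June 2025, so data after that point no longer includes zip codes (see SGO Amendment 3). The incident data download began providing zip codes again only after the SGO reporting form restored the zip code field in September 2025.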

Safety research

We actively conduct research and publish peer-reviewed findings on safety methodologies, performance data, and more

Read our publications

Download Data

Miles per Geo
Total miles driven in each location (through March 2026)

Download CSV
Crashes with SGO identifier and group membership
A list of cases with outcome group and other relevant collision information (through March 2026)

Download CSV
Collision count and comparisons to benchmarks by outcome and location
Aggregated by outcome and location (through March 2026)

Download CSV
Geographic distribution of benchmark and Waymo RO miles
Human benchmark crash counts for different outcome levels, human vehicle miles traveled (VMT), and Waymo RO miles reported by S2 cell through March 2025. This information can be used to reproduce the dynamic benchmark adjustments.

Download CSV
Release Notes
A description of changes to the data and methodologies used on the data hub, links to historical data, and data dictionaries.

Download PDF
