Using Google's Aggregated and Anonymized Trip Data to Support Freeway Corridor Management Planning in San Francisco

Bhargava Sana, Joe Castiglione, Dan Tischler, Drew Cooper
SAN FRANCISCO COUNTY TRANSPORTATION AUTHORITY
January 11, 2017

Outline
- Background on Freeway Corridor Management Study (FCMS) and OD data sources
- Google's Aggregated and Anonymized Trip (AAT) data
- AAT application and validation
- Estimate roadway facility-specific OD matrices
- Conclusions
- Limitations and next steps

Background: San Francisco Freeway Corridor Management Study (FCMS)
- US-101 / I-280 corridor
- Potential strategies
  - TDM, managed lanes
  - Ramp metering
- OD data could help impute
  - Socio-demographics (age, income, auto ownership, etc.)
  - Willingness to pay (WTP)

Background: OD Data Collection
- Conventional methods
  - License plate surveys
  - Roadside interviews/questionnaires
- Emerging passive data collection methods
  - Bluetooth detectors
  - Cell phone Call Detail Records (CDR) data

Aggregated and Anonymized Trip (AAT) Data
- Google's Better Cities Program
  - Minimize congestion, improve safety and reduce infrastructure spending
- Users opt in to share location
  - GPS, cell-phone towers, Wi-Fi detectors
- Aggregated and Anonymized Trip (AAT) information from location reports
  - Extract data from moving users
  - Clean data and snap to road network
  - Aggregate OD trip counts
  - Apply differential privacy filters and minimum trips threshold
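
The last two processing steps (aggregating OD trip counts, then applying privacy filters and a minimum trips threshold) can be illustrated with a minimal sketch. Google's actual procedure is not described in these slides, so everything below is an assumption made for illustration: the Laplace noise mechanism, the `noise_scale` and `min_trips` values, and the function names `aggregate_od` and `privatize` are all hypothetical.

```python
import random
from collections import Counter

def aggregate_od(trips):
    """Count trips per (origin, destination) pair."""
    return Counter(trips)

def privatize(counts, min_trips=10, noise_scale=1.0, seed=0):
    """Add Laplace noise to each OD count, then suppress small cells.

    ASSUMPTION: Laplace noise and these parameter values are illustrative;
    the slides do not state which mechanism or threshold Google uses.
    """
    rng = random.Random(seed)
    released = {}
    for od_pair, n in counts.items():
        # The difference of two i.i.d. exponential draws is Laplace-distributed.
        noise = rng.expovariate(1.0 / noise_scale) - rng.expovariate(1.0 / noise_scale)
        noisy = n + noise
        # Minimum trips threshold: cells below it are not reported at all.
        if noisy >= min_trips:
            released[od_pair] = noisy
    return released

# Toy input: 50 trips from district A to B, 3 trips from A to C.
toy_trips = [("A", "B")] * 50 + [("A", "C")] * 3
toy_counts = aggregate_od(toy_trips)
toy_released = privatize(toy_counts, min_trips=10, noise_scale=0.5)
```

In this toy case the low-volume pair is suppressed while the high-volume pair is released with a slightly perturbed count, which is why sparse district-level OD cells can be missing from a dataset processed this way.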

AAT Dataset for FCMS
- 85-district system: combination of Tract and County boundaries
- Via polygons for 4 segments around the US-101 and I-280 interchange in San Francisco
- Hourly AAT data for six months (Apr-Jun and Sep-Nov 2015)

AAT Application Considerations
- Flow data provided as relative trips as opposed to absolute counts
- Trips needed for planning purposes
- Convert relative flows to trips?

Caltrans Performance Measurement System (PeMS)
- Continuously monitors speeds and volumes using detectors
- Over 39,000 detectors on freeways all across California
- PeMS counts available for same time and location as AAT
  - Why not compare to Google AAT flows?
- PeMS vehicle counts -> person trips
  - Observed auto occupancy applied by location and time of day
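
The vehicle-to-person conversion amounts to multiplying each hourly count by an occupancy factor looked up by location and time of day. A minimal sketch follows; the occupancy values, the two-period split, and the names `OCCUPANCY`, `period_of`, and `person_trips` are hypothetical placeholders, not figures from the study.

```python
# ASSUMPTION: illustrative occupancy factors (persons per vehicle),
# keyed by (segment, time period). These are not the observed values.
OCCUPANCY = {
    ("US-101 north", "peak"): 1.15,
    ("US-101 north", "off_peak"): 1.30,
    ("I-280 north", "peak"): 1.10,
    ("I-280 north", "off_peak"): 1.25,
}

def period_of(hour):
    """Map an hour of day (0-23) to a coarse, illustrative time period."""
    return "peak" if hour in (6, 7, 8, 16, 17, 18) else "off_peak"

def person_trips(vehicles_per_hour, segment, hour):
    """Convert an hourly vehicle count to person trips."""
    return vehicles_per_hour * OCCUPANCY[(segment, period_of(hour))]

# 1,000 vehicles counted on US-101 north during the 0800-0900 hour.
example_person_trips = person_trips(1000, "US-101 north", 8)
```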

Comparison to PeMS Corridor Flows
Weekday Hourly Bi-Directional Flows Summary

| Freeway Segment | AAT Relative Flow: Count | Mean | Min | Max | PeMS (Vehs/Hr): Count | Mean | Min | Max |
|---|---|---|---|---|---|---|---|---|
| North on US-101 | 2,734 | 1.97E-03 | 2.34E-04 | 1.13E-02 | 1,845 | 9,502 | 132 | 15,885 |
| South on US-101 | 2,734 | 1.91E-03 | 2.34E-04 | 1.02E-02 | 1,863 | 9,931 | 512 | 16,310 |
| North on I-280 | 2,671 | 1.53E-03 | 2.34E-04 | 7.05E-03 | 1,872 | 3,844 | 108 | 8,216 |
| South on I-280 | 2,732 | 1.59E-03 | 2.34E-04 | 7.72E-03 | 1,872 | 6,357 | 96 | 11,019 |

Comparison to SF-CHAMP (ABM) OD Flows
[Figure: two panels comparing AAT and SF-CHAMP OD flows; Pearson's correlation = 0.995 and 0.994]

Relative Flow Conversion Models
- AAT relative flows vary by
  - Hour of day: added indicator variables
  - Roadway segment location: separate models estimated
- Several model forms and transformations explored

$$PT_t = K + \beta_f \log(RF_t) + \beta_t + \epsilon, \quad 0 \le t \le 23$$

where
- $PT_t$: Person trips on roadway facility for hour-of-day $t$
- $K$: Regression constant/intercept
- $\beta_f$: Coefficient of AAT relative flow
- $RF_t$: AAT relative flow for hour-of-day $t$
- $\beta_t$: Constant for hour-of-day $t$
- $\epsilon$: Error term
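
A model of this form can be estimated by ordinary least squares with an intercept, a log relative-flow term, and hour-of-day indicator variables. The sketch below fits it to synthetic data only. The natural logarithm, the choice of hour 0 as the reference category, the "true" parameter values, and the helper names `design_matrix` and `fit_conversion_model` are assumptions for illustration, not the authors' estimation code.

```python
import numpy as np

def design_matrix(rf, hour, n_hours=24):
    """Columns: intercept, log(RF), and indicators for hours 1..n_hours-1.

    ASSUMPTION: natural log, and hour 0 is the reference category.
    """
    rf = np.asarray(rf, dtype=float)
    hour = np.asarray(hour, dtype=int)
    dummies = np.zeros((rf.size, n_hours - 1))
    rows = np.nonzero(hour > 0)[0]
    dummies[rows, hour[rows] - 1] = 1.0
    return np.column_stack([np.ones(rf.size), np.log(rf), dummies])

def fit_conversion_model(rf, hour, person_trips):
    """Least-squares estimates ordered as [K, beta_f, beta_1, ..., beta_23]."""
    X = design_matrix(rf, hour)
    coef, *_ = np.linalg.lstsq(X, np.asarray(person_trips, dtype=float), rcond=None)
    return coef

# Synthetic data generated from known (made-up) parameters.
rng = np.random.default_rng(0)
hour = np.tile(np.arange(24), 50)                      # 50 synthetic days
rf = rng.uniform(2e-4, 1e-2, size=hour.size)           # relative flows
true_k, true_bf = 15000.0, 1800.0
true_bt = np.concatenate([[0.0], rng.uniform(500.0, 8000.0, size=23)])
pt = true_k + true_bf * np.log(rf) + true_bt[hour] + rng.normal(0.0, 200.0, size=hour.size)

coef = fit_conversion_model(rf, hour, pt)
```

Estimating one such model per roadway segment, as the slide describes, would mean calling `fit_conversion_model` separately on each segment's observations.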

Model Validation

Comparison of Facility-Specific OD Matrices with SF-CHAMP Select Link
Correlation coefficient

| Time Period | Segment | County level | District level |
|---|---|---|---|
| AM Peak | US-101 north of interchange | 0.948 | 0.440 |
| AM Peak | US-101 south of interchange | 0.751 | 0.390 |
| AM Peak | I-280 north of interchange | 0.947 | 0.279 |
| AM Peak | I-280 south of interchange | 0.881 | 0.342 |
| PM Peak | US-101 north of interchange | 0.964 | 0.422 |
| PM Peak | US-101 south of interchange | 0.685 | 0.334 |
| PM Peak | I-280 north of interchange | 0.940 | 0.253 |
| PM Peak | I-280 south of interchange | 0.846 | 0.312 |
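
The gap between county-level and district-level correlations is what one would expect when two OD matrices agree on coarse flows but differ in how trips are distributed among districts within a county. The sketch below shows the mechanics on toy matrices: aggregate a district-level matrix to counties, then compute Pearson's correlation over the matrix cells at each level. The matrices, the district-to-county mapping, and the function names are invented for illustration.

```python
import numpy as np

def aggregate_od(od, zone_to_group, n_groups):
    """Sum a zone-level OD matrix into a group-level (e.g. county) OD matrix."""
    od = np.asarray(od, dtype=float)
    groups = np.asarray(zone_to_group)
    membership = np.zeros((od.shape[0], n_groups))
    membership[np.arange(od.shape[0]), groups] = 1.0
    return membership.T @ od @ membership

def od_correlation(a, b):
    """Pearson correlation between the cells of two OD matrices."""
    return np.corrcoef(np.ravel(a), np.ravel(b))[0, 1]

# Toy example: 4 districts; districts 0-1 form county 0, districts 2-3 county 1.
district_to_county = [0, 0, 1, 1]
model_od = np.array([[5, 1, 2, 0],
                     [1, 5, 0, 2],
                     [2, 0, 5, 1],
                     [0, 2, 1, 5]])
# Same county-to-county totals, but origins 0 and 1 swapped within county 0.
observed_od = model_od[[1, 0, 2, 3], :]

district_r = od_correlation(model_od, observed_od)
county_r = od_correlation(
    aggregate_od(model_od, district_to_county, 2),
    aggregate_od(observed_od, district_to_county, 2),
)
```

Here the county-level matrices are identical while the district-level cells disagree, so `county_r` exceeds `district_r`.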

Summary and Conclusions
- New passive dataset (AAT) obtained from Google
  - Processed mobile users' location reports
- Reasonable regional flow patterns
  - High correlation with calibrated travel demand model (SF-CHAMP) flows
- Statistical relationship: AAT & observed counts (PeMS)
  - Appears to vary by hour-of-day and facility
- Hourly facility-specific OD matrices
  - O & D trip ends better correlated with model select link

Limitations and Next Steps
- Limitations
  - Google's AAT data significantly processed to protect privacy
    - Minimum trips threshold for reporting
  - At the time of analysis, AAT only available for vehicular traffic
    - Analysis limited to freeway corridors
  - AAT most likely represents person trips and not vehicle trips
    - Auto occupancy assumptions may be needed
- Next Steps
  - Compare AAT and cellphone CDR OD flows
  - Leverage the longitudinal aspect to perform before-after studies

Thank You
SAN FRANCISCO COUNTY TRANSPORTATION AUTHORITY

Estimation Results

| Variable | US-101 Coefficient | US-101 Std. Err. | US-101 t-stat | I-280 Coefficient | I-280 Std. Err. | I-280 t-stat |
|---|---|---|---|---|---|---|
| Intercept | 16370.0 | 585.3 | 28.1 | 13060.0 | 561.9 | 23.2 |
| log(weight) | 1873.2 | 74.3 | 25.3 | 1537.7 | 69.9 | 22.0 |
| Hour 0500-0600 | 2769.4 | 198.3 | 14.0 | 672.6 | 171.7 | 3.9 |
| Hour 0600-0700 | 6606.5 | 259.5 | 24.0 | 3430.3 | 239.1 | 14.3 |
| Hour 0700-0800 | 7412.0 | 290.9 | 24.0 | 5002.1 | 273.1 | 18.3 |
| Hour 0800-0900 | 7632.8 | 259.8 | 27.8 | 4872.6 | 257.3 | 18.9 |
| Hour 0900-1000 | 7912.7 | 239.5 | 33.0 | 4397.0 | 236.8 | 18.6 |
| Hour 1000-1100 | 7576.5 | 223.3 | 33.9 | 3930.8 | 205.1 | 19.2 |
| Hour 1100-1200 | 7540.3 | 215.4 | 34.9 | 3885.1 | 208.0 | 18.7 |
| Hour 1200-1300 | 7218.6 | 215.7 | 33.5 | 3839.1 | 208.2 | 18.4 |
| Hour 1300-1400 | 7180.7 | 214.3 | 33.6 | 4045.6 | 214.7 | 18.8 |
| Hour 1400-1500 | 6666.8 | 218.8 | 30.5 | 4973.5 | 215.5 | 23.1 |
| Hour 1500-1600 | 7545.4 | 225.8 | 33.4 | 5561.9 | 238.1 | 23.4 |
| Hour 1600-1700 | 7265.0 | 250.7 | 30.1 | 5816.0 | 251.5 | 23.1 |
| Hour 1700-1800 | 7688.0 | 255.5 | 31.2 | 5318.2 | 264.8 | 20.1 |
| Hour 1800-1900 | 7928.7 | 241.3 | 34.0 | 5111.6 | 240.3 | 21.3 |
| Hour 1900-2000 | 6825.3 | 227.9 | 30.0 | 4361.1 | 220.6 | 19.8 |
| Hour 2000-2100 | 5663.6 | 215.6 | 26.4 | 2622.2 | 205.7 | 12.7 |
| Hour 2100-2200 | 5084.6 | 207.2 | 24.6 | 2400.5 | 188.8 | 12.7 |
| Hour 2200-2300 | 3958.5 | 201.2 | 19.8 | 1631.5 | 183.5 | 8.9 |
| Hour 2300-2400 | 1926.8 | 183.7 | 10.6 | 768.6 | 163.9 | 4.7 |

| Statistic | US-101 | I-280 |
|---|---|---|
| No. of observations | 2592 | 2592 |
| Log-Likelihood | -23774.0 | -23282.0 |
| R-squared | 0.831 | 0.744 |
| Adj. R-squared | 0.830 | 0.742 |
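
To show how these estimates turn a relative flow into person trips, the sketch below plugs the tabulated coefficients into the conversion model. Three things are assumed rather than stated in the slides: that "log(weight)" is the natural log of the AAT relative flow, that hours 0000-0500 (which have no listed constant) act as the reference category with an offset of zero, and the names `MODELS` and `predict_person_trips`, which are hypothetical.

```python
import math

# Coefficients copied from the estimation results table; hour keys are the
# starting hour of each one-hour bin (e.g. 8 means 0800-0900).
MODELS = {
    "US-101": {
        "intercept": 16370.0,
        "log_rf": 1873.2,
        "hour": {5: 2769.4, 6: 6606.5, 7: 7412.0, 8: 7632.8, 9: 7912.7,
                 10: 7576.5, 11: 7540.3, 12: 7218.6, 13: 7180.7, 14: 6666.8,
                 15: 7545.4, 16: 7265.0, 17: 7688.0, 18: 7928.7, 19: 6825.3,
                 20: 5663.6, 21: 5084.6, 22: 3958.5, 23: 1926.8},
    },
    "I-280": {
        "intercept": 13060.0,
        "log_rf": 1537.7,
        "hour": {5: 672.6, 6: 3430.3, 7: 5002.1, 8: 4872.6, 9: 4397.0,
                 10: 3930.8, 11: 3885.1, 12: 3839.1, 13: 4045.6, 14: 4973.5,
                 15: 5561.9, 16: 5816.0, 17: 5318.2, 18: 5111.6, 19: 4361.1,
                 20: 2622.2, 21: 2400.5, 22: 1631.5, 23: 768.6},
    },
}

def predict_person_trips(facility, relative_flow, hour):
    """Predicted hourly person trips for a facility and hour of day (0-23)."""
    model = MODELS[facility]
    # ASSUMPTION: hours without a listed constant are the reference (offset 0).
    hour_offset = model["hour"].get(hour, 0.0)
    # ASSUMPTION: the log transform is the natural logarithm.
    return model["intercept"] + model["log_rf"] * math.log(relative_flow) + hour_offset

# Mean relative flow for "North on US-101" from the PeMS comparison table,
# evaluated for the 0800-0900 hour.
us101_example = predict_person_trips("US-101", 1.97e-3, 8)
```

Under these assumptions the US-101 example evaluates to roughly 12,300 person trips per hour, the same order of magnitude as the PeMS mean of about 9,500 vehicles per hour once an occupancy above one person per vehicle is applied.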