Relying Solely on Postal Code to Assign People/Events to Census Geography or Ministry Geography


1 Relying Solely on Postal Code to Assign People/Events to Census Geography or Ministry Geography
Health Analytics Branch
Health System Information Management & Investment Division
Ministry of Health and Long-Term Care
2011 APHEO Conference, May 15-16, 2011

2 What will we cover?
  Why do we need crosswalks between different geographies?
  How do we assign people to different geographies based on their postal code?
  Limitations and considerations when using the PCCF and PCCF+ for assigning people and events
  What does this mean for public health epidemiologists when assigning people and events to small areas?

3 Why do we need crosswalks between different geographies?
  Different data sources collect different geographic locators:
    Census data are available at, among others, CD, CSD and DA levels
    Vital statistics until 2007 included municipality and postal code, but from 2008 will have only postal code
    Utilization data vary:
      Hospital separations (DAD) and emergency room visits (NACRS): ministry residence code (municipality) and postal code
      Medical claims (OHIP) and Continuing Care Reporting System (CCRS): postal code
    Population health data: postal codes and/or municipalities
  The geographic level at which data are available may not be the one needed for analysis
  Different data sources may have to be linked to paint a complete picture, to answer research questions or to support program planning/decision making
  Postal codes to census geography and ministry geography

4 How do we assign people to different geographies based on their postal code?
PCCF and Ministry Conversion Files, using the Single Link Indicator (SLI)
  The Postal Code Conversion File (PCCF) is a digital file which provides a correspondence between the Canada Post Corporation (CPC) six-character postal code and Statistics Canada's standard geographic areas (e.g. census subdivisions, dissemination areas, dissemination block and census tract) for which census data and other statistics are produced. Through the link between postal codes and standard geographic areas, the PCCF permits the integration of data from various sources. - Extracted from Postal Code Conversion File (PCCF) Reference Guide (July 2009 Postal codes)
  MOHLTC creates, on an annual basis, a series of conversion files that link postal code to ministry geographies: residence codes (municipality), public health unit and LHIN.
PCCF+, using random allocation
  PCCF+ Version 5 consists of a SAS control program and a series of reference files derived from the most recent Statistics Canada Postal Code Conversion File (PCCF) and 2006 postal code population weight file (WCF). - Extracted from PCCF+ Version 5F User's Guide (July 2009 Postal Codes)
GIS, using geographic location
  The latitude and longitude of the postal code, available in the PCCF, allow postal codes to be mapped spatially. With ArcGIS, or other GIS software, postal codes can be linked to other geographies based on their geographic location.
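As an illustration of the SLI approach, here is a minimal Python sketch of a postal code lookup built from the records flagged with the single link indicator. The record layout, postal codes and area codes are hypothetical; the real PCCF is a much wider file and should be read according to its reference guide.

```python
# Hypothetical PCCF-like records: (postal_code, csd, da, sli).
# These values are made up for illustration only.
PCCF_RECORDS = [
    ("A1A1A1", "CSD_X", "DA_001", 1),
    ("A1A1A1", "CSD_Y", "DA_002", 0),
    ("B2B2B2", "CSD_Y", "DA_003", 1),
]


def build_sli_lookup(records):
    """Keep only the record flagged SLI = 1 for each postal code."""
    return {pc: (csd, da) for pc, csd, da, sli in records if sli == 1}


def normalize(postal_code):
    """Upper-case and drop spaces so 'a1a 1a1' matches 'A1A1A1'."""
    return postal_code.replace(" ", "").upper()


def assign(postal_code, lookup):
    """Return (csd, da) for a postal code, or None if it is not in the file."""
    return lookup.get(normalize(postal_code))


sli_lookup = build_sli_lookup(PCCF_RECORDS)
print(assign("a1a 1a1", sli_lookup))
print(assign("Z9Z9Z9", sli_lookup))
```

Every person or event with the same postal code receives the same CSD and DA under this approach, which is why areas that never hold the SLI record receive no assignments.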

5 How are the Ministry Conversion Files Created?
[Flow diagram; box labels as extracted:]
  2006 Census CSD (Stat Can): all CD/CSDs
  MOHLTC Residence Codes (MOHLTC): all residence codes
  LHIN, PHU (MOHLTC)
  PCCF (Stat Can): SLI
  One CSD to Residence Code, LHIN, PHU (MOHLTC)
  Postal Code, CSD, DA (MOHLTC)
  DA to LHIN Correspondence File (Stats Can)
  Postal Code to Residence Code & PHU (MOHLTC)
  Postal Code to LHIN (MOHLTC)
  Postal Code, CD/CSD/DA, LHIN, Residence Codes, PHU (MOHLTC)

6 Update/Release Cycles for Different Geographies and Related Products
[Table of update/release cycles; row labels as extracted:]
  PC (Canada Post)
  CD (StatsCan)
  CSD (StatsCan)
  CT (StatsCan)
  DA (StatsCan)
  Residence Code (MOHLTC)
  PHU (MOHLTC)
  LHIN (MOHLTC)
  PCCF/PCCF+ (StatsCan)
  Conversion Files (MOHLTC)

7 What should we be aware of when using PCCF and PCCF+ to assign people to census geography and MOHLTC geography?
Objective: To demonstrate the advantages and limitations of the different methods available for assigning people/events to other geographies based solely on postal code
Methods
  To assess completeness of coverage and consistency of assignment in the same year, PCCF 2008, PCCF 2009 and PCCF 2010, and their related ministry conversion files, were analyzed.
  To assess stability of assignment over time, longitudinal analyses were conducted.
  For longitudinal comparison, PCCF 2009 was used as the reference. Assignment based on PCCF 2009 was compared with that based on PCCF 2008 and PCCF 2010.
  Postal codes can be retired and reborn. PCCF includes retired as well as active records. Only active records are included in the analysis conducted for this review.

8 Limitations of Postal Code Conversion Files (PCCF/PCCF+)
  PCCF contains multiple records for a postal code when the postal code straddles more than one block-face, dissemination block, or dissemination area. - Extracted from Postal Code Conversion File (PCCF) Reference Guide (July 2009 Postal codes)
  Over a third of all postal codes have multiple records
    Especially postal codes that cover multiple address ranges, rural routes and community mailboxes
  For example: Delivery via a Rural Route, e.g. K8A6W3
    PCCF 2009: 39 Active Records

9 Delivery Model Type (DMT) for Active Postal Code (PCCF 2009)
Number of Active Postal Codes
Columns: Delivery Model Type Description; PC with Multiple Records; All Active PC; Percent
  Delivery to Face-Block Address 85, , %
  Delivery to an Apartment Building 753 8, %
  Delivery to a Business Building 605 5, %
  Delivery to a Large Volume Receiver 606 3, %
  Delivery via a Rural Route %
  General Delivery %
  Delivery to a Post Office Box (not a Community Mail Box) 1,644 2, %
  Delivery to a Large Volume Receiver (post office box) 2,036 2, %
  Delivery via a Suburban Service %
  Rural Postal Code (second digit of postal code is 0) 1,078 1, %
  Total 93, , %

10 Delivery via a Rural Route e.g. K8A6W3
Columns: Address Ranges; Number; Delivery Mode; Delivery Installation; Street Name; City
  Odd RR 2 BEACHBURG RD PEMBROKE
  Even RR 2 BEACHBURG RD PEMBROKE
  400 RR 2 COON HOLLOW TRAIL PEMBROKE
  Odd RR 2 FINCHLEY RD PEMBROKE
  186 RR 2 FINCHLEY RD PEMBROKE
  RR 2 GREENWOOD RD PEMBROKE
  Odd RR 2 HIGHWAY 17 PEMBROKE
  RR 2 HILA RD PEMBROKE
  Even RR 2 INDIAN RD PEMBROKE
  Odd RR 2 INDIAN RD PEMBROKE
  111 RR 2 INDUSTRIAL PARK RD PEMBROKE
  361 RR 2 JEBWOOD TRAIL PEMBROKE
  Even RR 2 MCGONEGAL RD PEMBROKE
  RR 2 MCLAUGHLIN RD PEMBROKE
  RR 2 MEATH HILL RD PEMBROKE
  RR 2 STURGEON MOUNTAIN RD PEMBROKE
  RR 2 SUTHERLAND RD PEMBROKE
  RR 2 VALLEY VIEW PEMBROKE
  RR 2 ZION LINE PEMBROKE
  RR 2 STN MAIN PEMBROKE
Extracted from Canada Post website

11 Delivery via a Rural Route e.g. K8A6W3 Address Ranges

12 How do PCCF and PCCF+ deal with multiple records?
PCCF and Ministry Conversion Files
  The single link indicator (SLI) was created to deal with postal codes with multiple records. The single link indicator identifies the geographic area with the majority of dwellings assigned to a particular postal code. Users should be aware that only a partial correspondence between the postal code and other geographic areas is achieved when using the single link indicator. - Extracted from Postal Code Conversion File (PCCF) Reference Guide (July 2009 Postal codes)
PCCF+
  Records for most postal codes which serve more than one dissemination area, including most rural postal codes and several classes of urban postal codes, are assigned geographic codes based on a population-weighted random allocation among the possible dissemination areas and blocks. - Extracted from PCCF+ Version 5F User's Guide (July 2009 Postal Codes)
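The contrast between the two approaches can be sketched in a few lines of Python. This is an illustration only, not the PCCF+ SAS program: the postal code, dissemination areas and weights are hypothetical, and a single weight stands in for both the dwelling counts behind the SLI and the population weights used by PCCF+.

```python
import random
from collections import Counter

# Hypothetical candidate DAs for one postal code, with made-up weights.
CANDIDATES = {"A1A1A1": [("DA_001", 600), ("DA_002", 300), ("DA_003", 100)]}


def assign_sli(postal_code):
    """SLI-style: every record goes to the single largest candidate."""
    return max(CANDIDATES[postal_code], key=lambda c: c[1])[0]


def assign_weighted(postal_code, rng):
    """PCCF+-style: draw one candidate at random, in proportion to its weight."""
    das, weights = zip(*CANDIDATES[postal_code])
    return rng.choices(das, weights=weights, k=1)[0]


def simulate(n, seed):
    """Allocate n events from the same postal code with a seeded generator."""
    rng = random.Random(seed)
    return Counter(assign_weighted("A1A1A1", rng) for _ in range(n))


print(assign_sli("A1A1A1"))
print(simulate(10_000, seed=0))
```

With the SLI-style rule, DA_002 and DA_003 never receive an event. With the weighted draw, all three receive a share roughly in proportion to their weights, at the cost that two events from the same address can land in different areas.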

13 Are all units in each geographic level covered with the different assignment methods?
Completeness of coverage
  Even when all records are included, PCCF does not cover all units at every geographic level.
  When only the record with the SLI is used, more areas are missed at every geographic level.
  With all 3 PCCFs examined, only partial coverage is achieved at CSD, CT, DA and Residence Code levels.
  For example, with PCCF 2009:
    No individual or event will be assigned to 4 CTs even if all records are used; 20 CTs will be missed if only SLI records are used.
    46 CSDs are not covered if all records are used, and 151 CSDs will be missed when only SLI records are used.

14 Are all units in each geographic level covered with the different assignment methods?
The table below shows the number of units covered by each PCCF by geographic level.

                                PCCF 2008                  PCCF 2009                  PCCF 2010
Geographic Level   # of Units   All Records  SLI Records   All Records  SLI Records   All Records  SLI Records
CD
CSD
CT                 2,136        2,132        2,117         2,132        2,116         2,131        2,117
DA                 19,177       18,988       16,340        18,981       16,352        18,979       16,377
Residence Code*
PHU*
LHIN*

* PCCF combined with Ministry conversion files
Coverage is not an issue at PHU level (all PHUs have postal codes linked to them).

15 Are all units in each geographic level covered with the different assignment methods?
The table below shows the number of units not covered by each PCCF by geographic level.

                                PCCF 2008                  PCCF 2009                  PCCF 2010
Geographic Level   # of Units   All Records  SLI Records   All Records  SLI Records   All Records  SLI Records
CD
CSD
CT                 2,136        4            19            4            20            5            19
DA                 19,177       189          2,837         196          2,825         198          2,800
Residence Code*
PHU*
LHIN*

* PCCF combined with Ministry conversion files
Coverage is not an issue at PHU level (all PHUs have postal codes linked to them).
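The count of units not covered is simply the number of units minus the number of units covered. The sketch below applies that subtraction to the CT and DA figures from the units-covered table (slide 14); the dictionary layout and function name are illustrative choices, not part of the original analysis.

```python
# Number of units at each level, from the units-covered table (slide 14).
UNITS = {"CT": 2136, "DA": 19177}

# Units covered as (all records, SLI records) for each PCCF year.
COVERED = {
    "CT": {"2008": (2132, 2117), "2009": (2132, 2116), "2010": (2131, 2117)},
    "DA": {"2008": (18988, 16340), "2009": (18981, 16352), "2010": (18979, 16377)},
}


def not_covered(level, year):
    """Return (missed with all records, missed with SLI records only)."""
    all_records, sli_records = COVERED[level][year]
    return UNITS[level] - all_records, UNITS[level] - sli_records


for level in UNITS:
    for year in ("2008", "2009", "2010"):
        print(level, year, not_covered(level, year))
```

For PCCF 2009 this gives 4 CTs missed with all records and 20 with SLI records only, which agrees with the figures quoted on slide 13.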

16 CSDs not Covered by All Records of Active Postal Codes in PCCF 2009
  Indian Reserves & Settlements: 40
  Unorganized: 3
  Townships: 2
  Total CSD: 45
  Total population: 8,
CSDs are not covered by records of active postal codes in PCCF *
* Not all are shown because of the scale of the map.
Even with PCCF+, coverage is an issue when we try to drill below PHU level.

17 CSDs not Covered by SLI Records of Active Postal Codes in PCCF 2009
  Indian Reserves & Settlements: 98
  Unorganized: 7
  Municipality: 1
  Towns: 10
  Townships: 38
  Villages: 2
  Total CSD: 156
  Total population: 91,
CSDs are not covered by the SLI records of active postal codes in PCCF *
* Not all are shown because of the scale of the map.
Coverage is even more of an issue with SLI when we try to drill below PHU level.

18 Many postal codes are linked to multiple geographic units within the same PCCF
Consistency of assignment in the same year
  In PCCF 2009, of the 275,117 active postal codes, 1,268 are linked to more than one CSD.
  With SLI in PCCF, all people/events from these postal codes are linked to one CSD.
  With PCCF+, people/events are distributed across geographic units. As a result, two people living in the same household may end up being assigned to different units.
  Consistency is not an issue at PHU level (all postal codes are linked to one PHU).
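A count like the one above can be obtained by grouping the active records by postal code and collecting the distinct CSDs for each. The Python sketch below shows the idea on hypothetical records; it is not the code used for the review.

```python
from collections import defaultdict

# Hypothetical active records: (postal_code, csd). Values are made up.
RECORDS = [
    ("A1A1A1", "CSD_X"),
    ("A1A1A1", "CSD_Y"),
    ("A1A1A1", "CSD_X"),
    ("B2B2B2", "CSD_Y"),
    ("C3C3C3", "CSD_Z"),
    ("C3C3C3", "CSD_X"),
]


def csds_per_postal_code(records):
    """Map each postal code to the set of distinct CSDs it is linked to."""
    linked = defaultdict(set)
    for postal_code, csd in records:
        linked[postal_code].add(csd)
    return dict(linked)


def multi_linked(records):
    """Postal codes linked to more than one CSD, in sorted order."""
    return sorted(
        pc for pc, csds in csds_per_postal_code(records).items() if len(csds) > 1
    )


print(multi_linked(RECORDS))
```

The same grouping works at any other level (DA, CT, residence code) by swapping the second field of each record.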

19 Example of Multiple Active Records for a Single Postal Code
N0H2T0
This map shows the northern portion of Grey Bruce Health Unit. In PCCF 2009, N0H2T0 has 561 active records, which link it to 9 CSDs and 33 DAs. The 9 CSDs are shown on the map.

20 Example of Multiple Active Records for a Single Postal Code
N0H2T0
The one active record with SLI = 1 links the postal code to a single CSD, CSD (in brown).

21 How many postal codes are linked to different geographic units with PCCF for different years?
Stability of assignment over time
Example: a person residing in any one of the 209 postal codes where the PCCF 2008 SLI record points to one DA while the PCCF 2009 SLI record points to another would appear to have moved from one DA to another.
Number of Postal Codes
Columns: PCCF 2009 vs. PCCF 2008 (Same Unit, Different Units); PCCF 2009 vs. PCCF 2010 (Same Unit, Different Units)
  At CD Level 273, ,
  At CSD Level 273, ,
  At CT Level* 235, ,
  At DA Level 273, ,
  At Residence Code Level 273, ,
  At PHU Level 273, ,
  At LHIN Level 273, ,
* PCCF combined with Ministry conversion files
* Not all postal codes are linked to a census tract.
Numbers of postal codes reported by geography level are in relation to PCCF
Stability is an issue at all levels of geography including PHU (27 postal codes are linked to different PHUs from PCCF 2008 to PCCF 2009).
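The longitudinal comparison amounts to taking the postal codes present in both years and checking whether the SLI record points to the same unit. A minimal Python sketch with hypothetical postal codes and DA codes:

```python
# Hypothetical SLI assignments (postal code -> DA) for two PCCF years.
SLI_2008 = {"A1A1A1": "DA_001", "B2B2B2": "DA_003", "C3C3C3": "DA_004"}
SLI_2009 = {"A1A1A1": "DA_002", "B2B2B2": "DA_003", "D4D4D4": "DA_005"}


def compare(reference, other):
    """Split postal codes found in both years into same-unit and different-unit lists."""
    common = reference.keys() & other.keys()
    same = sorted(pc for pc in common if reference[pc] == other[pc])
    different = sorted(pc for pc in common if reference[pc] != other[pc])
    return same, different


same, different = compare(SLI_2009, SLI_2008)
print("same unit:", same)
print("different units:", different)
```

Postal codes that exist in only one of the two years (C3C3C3 and D4D4D4 here) are left out of both lists; whether and how to report them is a separate choice.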

22 Example of Postal Codes Linked to Different CSD in PCCF 2008 versus PCCF 2009
K8A6W3, K8A6W4
This map shows part of the Renfrew County and District Health Unit. With PCCF 2009, K8A6W3 is linked to CSD and K8A6W4 is linked to CSD

23 Example of Postal Codes Linked to Different CSD in PCCF 2008 versus PCCF 2009
K8A6W3, K8A6W4
This map shows part of the Renfrew County and District Health Unit. With PCCF 2008, both postal codes, K8A6W3 and K8A6W4, are linked to CSD

24 Other Considerations when choosing between PCCF, PCCF+ and Related Products
Dealing with Invalid Postal Code
  Postal codes that do not match exactly to the PCCF or WCF, the first one to five characters of the postal code are used to try to assign partial geographic identifiers to the extent possible. This takes care of many situations where one or more characters of the postal code are invalid, but the first one to five characters are valid. Problem records include full diagnostic and reference information. - Extracted from PCCF+ Version 5F User's Guide
User-Friendliness
  PCCF, with SLI, may be used with any database/statistical software.
  As well, using the SLI records from PCCF, MOHLTC creates postal code conversion files which serve as crosswalks between postal code and census geography and ministry geography. These files provide easy conversions between postal code and other geographies without the need for statistical software.
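The prefix idea described in the user's guide extract can be illustrated with a simplified Python sketch. This is not the PCCF+ algorithm: the lookup table, the codes and the rule "keep an identifier only when every postal code sharing the prefix agrees on it" are assumptions made for the example.

```python
# Hypothetical lookup: postal code -> (phu, csd). Values are made up.
KNOWN = {
    "A1A1A1": ("PHU_1", "CSD_X"),
    "A1A1A2": ("PHU_1", "CSD_Y"),
    "B2B2B2": ("PHU_2", "CSD_Z"),
}


def unique_or_none(values):
    """Return the single value if all candidates agree, otherwise None."""
    return next(iter(values)) if len(values) == 1 else None


def assign_with_fallback(postal_code, known):
    """Return ((phu, csd), matched_length), falling back to shorter prefixes."""
    pc = postal_code.replace(" ", "").upper()
    if pc in known:
        return known[pc], 6
    for n in range(5, 0, -1):
        if len(pc) < n:
            continue  # not enough characters to form a prefix of this length
        matches = [geo for code, geo in known.items() if code.startswith(pc[:n])]
        if matches:
            phus = {phu for phu, _ in matches}
            csds = {csd for _, csd in matches}
            return (unique_or_none(phus), unique_or_none(csds)), n
    return (None, None), 0


print(assign_with_fallback("A1A1A9", KNOWN))
print(assign_with_fallback("Z9Z9Z9", KNOWN))
```

In this toy table, the invalid code A1A1A9 shares five characters with two valid postal codes in the same PHU but different CSDs, so only the PHU can be assigned.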

25 Choosing between PCCF, PCCF+ and Related Products
Example: Vital Statistics, Deaths, 2007
To study the impact of Statistics Canada's decision to remove CSD from all vital statistics files (effective with 2008 births, stillbirths, deaths & marriages)
Method:
1. Compare the CSD assigned based on the postal code with the municipality code recorded (still available in 2007 files), using the recorded CSD as reference.

                PCCF Plus
PCCF (SLI)      Match           Mismatch        Total
Match           72,280 (83%)    2,119 (2%)      74,399 (86%)
Mismatch        1,011 (1%)      11,533 (13%)    12,544 (14%)
Total           73,291 (84%)    13,652 (16%)    86,943 (100%)

2. The postal code types that make up each of the four categories were analyzed.
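The percentages in the cross-tabulation can be checked from the four cell counts. The Python sketch below recomputes them and adds the share of deaths for which the two methods reach the same verdict; the two off-diagonal cells are pooled, so the calculation does not depend on which method is on which axis.

```python
# Cell counts from the 2007 deaths cross-tabulation on this slide.
BOTH_MATCH = 72_280
BOTH_MISMATCH = 11_533
ONLY_ONE_MATCHES = (2_119, 1_011)  # the two off-diagonal cells

total = BOTH_MATCH + BOTH_MISMATCH + sum(ONLY_ONE_MATCHES)


def percent(count):
    """Share of all deaths, rounded to a whole percent."""
    return round(100 * count / total)


print("total deaths:", total)
print("both match recorded CSD:", percent(BOTH_MATCH), "%")
print("both mismatch:", percent(BOTH_MISMATCH), "%")
print("methods disagree:", percent(sum(ONLY_ONE_MATCHES)), "%")
print("methods agree:", percent(BOTH_MATCH + BOTH_MISMATCH), "%")
```

The cells add up to the 86,943 deaths reported in the table, and the two methods give the same verdict (both match or both mismatch) for about 96% of them.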

26 Choosing between PCCF, PCCF+ and Related Products
Example: Vital Statistics, Deaths, 2007
[Stacked bar chart, 0% to 100%, one bar per match category]
Categories: Both match municipality recorded; Both do not match; SLI do not match, PCCF+ match; SLI match, PCCF do not match
Legend: Error: No Match to PCCF; Warning: Non-residential; Warning: Commercial or institutional; Note: Multiple records (equal weight allocation); No error/warning message; Error: Linked to PO geography; Warning: Business building; Warning: Retired postal code; Note: Multiple records (population-based allocation)
Data labels (order as extracted): No error/warning message, 13%; No error/warning message, 12%; Note: Multiple records (population-based allocation), 13%; No error/warning message, 7%; No error/warning message, 83%; Error: No Match to PCCF, 73%; Note: Multiple records (population-based allocation), 87%; Note: Multiple records (population-based allocation), 83%; Note: Multiple records (population-based allocation), 10%; Warning: Commercial or institutional, 6%; Warning: Non-residential, 6%

27 Assigning Postal Codes to Census Geography and Ministry Geography
[Diagram; labels as extracted:]
  Postal Code
  PCCF; ArcGIS (spatial join); PCCF+
  SLI; MOHLTC Conversion Files (SLI)
  Residence Code; CD; CSD; CT; DA; PHU; LHIN

28 What does it mean for public health epidemiologists?
Which level of geography are you interested in?
  When trying to assign people/events to high-level geographies, e.g. CDs or public health units, relying solely on postal codes is less of an issue.
  Given the limitations of coverage and consistency we have demonstrated, serious consideration has to be given when designing any data collection and maintenance initiatives.
Should we rely on postal code alone as the geographic locator?
  Since all methods available for assigning people/events to other geographies based solely on postal code have their pros and cons, users should select the method most appropriate for their purpose.
  Be aware of whether the geographic levels in the data set you are using were collected directly or were derived from postal code (for example, medical claims, CCHS, vital stats). Find out what method was used to allocate people/events.
The choice depends on:
  The quality of the data to be linked, specifically the quality of the postal code field (PCCF+ can take care of invalid postal codes).
  Analytical capacity (PCCF does not require the use of SAS while PCCF+ does; with GIS capabilities, postal codes may be linked to other geographies based on their geographic location).

29 For more information, please contact:
Elsa Ho
Health Analytics Branch
Health System Information Management and Investment Division
Ministry of Health and Long-Term Care
Tel:
Carol Paul
Health Analytics Branch
Health System Information Management and Investment Division
Ministry of Health and Long-Term Care
Tel.: