Overview & Update of UCSF Clinical Data and Tools for Research

1 Overview & Update of UCSF Clinical Data and Tools for Research Rick Larsen Director of Research Informatics, Academic Research Systems, UCSF IT

2 Clinical Data on the Move Clinical Decision Support Intelligent clinical devices, Learning Health Systems Patterns and Predictions Information Commons Retrospective Analytics EMR, Data Marts, Data Warehouses Clinical Data Colloquium for Research

3 Agenda: Overview of the current UCSF clinical research data landscape, including what's new/changed in the last 18 months.

4 What clinical data are available at UCSF for research? UCSF electronic medical record (EMR) data: APeX data dating back to 2012; STOR data dating back to 1988; images. Plus additional data, such as: geocoded address data; CA Death Registry data; ZSFG and other Department of Public Health data; UC Health data (EMR data from UC Davis, UC Irvine, UCLA, UCSD, UCSF) for patient counts across UC Health.

5 Current UCSF Clinical Research Data Landscape. Note: CDW = Clinical Data Warehouse.

6 UCSF Tools that Support Counts. RDB (Research Data Browser): cohort exploration; de-identified record drill down. PatientExploreR: cohort exploration; de-identified record drill down; potential replacement of RDB. UC ReX Data Explorer: i2b2 interface; cohorts across the UCs.

7 UCSF Tools that Support Counts
Tool | UCSF Data | UC Health Data Available | Drill Down to Row-Level Details | Notes
RDB | Yes | No | Yes | Evaluating replacement options (options include PatientExploreR, TriNetX, ...)
UC ReX | Yes | Yes | No | Part of the X-UC CDW project is to introduce a new tool
PatientExploreR | Yes | No | Yes | Initial availability on Information Commons
data.ucsf.edu

8 De-Identified Data Sets

9 De-Identified Data Sets
Tool | UCSF Data | UC Health Data Available | Drill Down to Row-Level Details | Notes
RDB Flat Files | Yes | No | Yes | Looking to sunset in 2019
De-Identified CDW | Yes | No | Yes | Details in following slides
Information Commons | Yes | No | Yes | Details in following section

10 Identified Data Sets Availability & Services. Clarity: a relational database representation of the Epic EMR database; ~13,000 tables; ~110,000 columns. Caboodle Clinical Data Warehouse (CDW): ~170 tables; ~2,700 columns; data from Epic and other sources. ZSFG: multiple legacy systems. X-UC CDW: coming soon for research.

11 Identified Data Sets Availability and Services
Data | Date Ranges | Notes
Epic EHR Data | Current | Clarity & CDW
STOR Historical Data | | CDW
ZSFG EHR Data | Current | ZSFG legacy systems; moving to Epic
X-UC CDW | Current | Cross-UC-wide centralized CDW
Delivery of UCSF & ZSFG data: CTSI ARS partnership to provide centralized Honest Broker services; significant improvements in time of delivery; limited resources and expanding demand; data.ucsf.edu to request data.
X-UC CDW (Cross UC, includes 5 UC Health EMRs): governance and procedures in the works.

12 Getting to the Right Data Can Be Challenging. Tracing data back to the source system. The source data is focused on clinical care and operations (not research). Data elements that look similar but differ (e.g., multiple diagnosis codes). It often takes a team effort.

13 What's been added to UCSF data for research in the last 18 months

14 STOR Data Going Back to 1982, Now Available. Ambulatory medical record system created at UCSF. ~3,000,000 patients. Demographics, encounters, diagnoses, procedures. Other clinical information (e.g., lab system). Physician-generated clinical data (e.g., problem lists). Clinical notes (dictated into STOR).

15 Geocoded Data Now Available. ~6 million addresses extracted from the UCSF EHR. 87% geocoded: (x,y) coordinates on map; census tract/block, etc. Census tract/block/zip data added back into CDW. Use cases supported: describe the geographic distribution of different medical conditions; analyze neighborhood factors that might cause disease or contribute to disparities; look for place-based opportunities for interventions (e.g., schools, churches, community centers).
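
The first use case above (geographic distribution of a condition) can be sketched as a per-tract patient count. This is a minimal illustration only: the row layout, tract identifiers, and diagnosis codes below are hypothetical and do not reflect the actual CDW schema.

```python
from collections import Counter

# Hypothetical de-identified rows: (patient_id, census_tract, diagnosis_code).
# Field names and values are illustrative, not the actual CDW schema.
rows = [
    ("p1", "06075010100", "J45"),
    ("p2", "06075010100", "J45"),
    ("p3", "06075020200", "E11"),
    ("p4", "06075020200", "J45"),
    ("p5", None, "J45"),  # address that could not be geocoded
]

def patients_per_tract(rows, code_prefix):
    """Count distinct patients per census tract whose diagnosis code starts with code_prefix."""
    seen = set()
    counts = Counter()
    for patient_id, tract, dx in rows:
        # Skip rows without a geocoded tract or with a non-matching diagnosis.
        if tract is None or not dx.startswith(code_prefix):
            continue
        # Count each patient at most once per tract.
        if (patient_id, tract) not in seen:
            seen.add((patient_id, tract))
            counts[tract] += 1
    return dict(counts)

asthma_by_tract = patients_per_tract(rows, "J45")
print(asthma_by_tract)
```

Rows without a geocoded tract are dropped rather than guessed, which mirrors the fact that only part of the address set could be geocoded.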

16 California Death Registry Data. We now have date of death for 163,000+; prior to this we had ~16,000. CA Death Registry date added as a new field in the CDW. Monthly updates. Use cases supported: exclude from trial recruitment. Availability: in the upcoming release of the De-ID CDW (dates shifted); request death dates as part of an IRB-approved Clinical Data Request (working on getting official permission from the CDPH).
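
The trial-recruitment use case amounts to filtering a candidate list on the death-date field. The sketch below assumes a hypothetical record shape with a `death_date` key that is `None` when no registry match exists; it is not the actual CDW field name.

```python
from datetime import date

# Hypothetical candidate list; death_date is None when no registry match exists.
candidates = [
    {"patient_id": "p1", "death_date": None},
    {"patient_id": "p2", "death_date": date(2017, 3, 14)},
    {"patient_id": "p3", "death_date": None},
]

def eligible_for_contact(candidates):
    """Return IDs of candidates with no recorded date of death."""
    return [c["patient_id"] for c in candidates if c["death_date"] is None]

print(eligible_for_contact(candidates))
```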

17 Introducing the UCSF De-Identified Clinical Data Warehouse (De-ID CDW)

18 De-Identified Clinical Data Warehouse (CDW). Starts with the UCSF CDW (Caboodle). Removal of all Protected Health Information (PHI). Safe Harbor approach. Structured data only (not notes). Monthly updates.
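
The slides do not describe how the De-ID CDW is actually built. As a rough illustration of the general idea (dropping direct identifiers and shifting dates consistently per patient), here is a minimal sketch. The identifier field list, the keyed-hash offset scheme, and the 365-day window are all assumptions made for the example, not UCSF's method, and this sketch alone does not satisfy the full HIPAA Safe Harbor requirements.

```python
import hashlib
import hmac
from datetime import date, timedelta

# Hypothetical direct-identifier fields to drop; the real field list is not given in the slides.
DIRECT_IDENTIFIERS = {"name", "mrn", "address", "phone"}

def shift_days(patient_id: str, secret: bytes, max_days: int = 365) -> int:
    """Derive a stable per-patient offset in [1, max_days] from a keyed hash."""
    digest = hmac.new(secret, patient_id.encode("utf-8"), hashlib.sha256).digest()
    return int.from_bytes(digest[:4], "big") % max_days + 1

def surrogate_id(patient_id: str, secret: bytes) -> str:
    """Replace the real patient ID with a keyed-hash surrogate."""
    msg = b"id:" + patient_id.encode("utf-8")
    return hmac.new(secret, msg, hashlib.sha256).hexdigest()[:16]

def deidentify(record: dict, secret: bytes) -> dict:
    """Drop direct identifiers, swap in a surrogate ID, and shift all dates backward."""
    offset = timedelta(days=shift_days(record["patient_id"], secret))
    out = {}
    for field, value in record.items():
        if field in DIRECT_IDENTIFIERS:
            continue  # remove direct identifiers entirely
        if field == "patient_id":
            out[field] = surrogate_id(value, secret)
        elif isinstance(value, date):
            out[field] = value - offset  # same offset for every date of this patient
        else:
            out[field] = value
    return out

record = {
    "patient_id": "p1",
    "name": "Example Patient",
    "mrn": "000123",
    "admit_date": date(2018, 1, 10),
    "discharge_date": date(2018, 1, 15),
    "dx": "J45",
}
deid = deidentify(record, b"example-secret")
print(sorted(deid))
```

Because every date for a given patient moves by the same offset, intervals such as length of stay are preserved even though the absolute dates change.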

19 What's in De-ID CDW vs. Existing RDB. RDB: contained a subset of CDW data from mid Present; clinical data for ~1,000,000 patients. De-ID CDW: contains all of the data that is in RDB plus: historical data going back to 1982 (~3,000,000 patients); medication administration & dispensed; provider data; ED visit data; patient registry data (APeX registries); surgical episode detailed data (including surgical procedures, supplies, ...); and much more. Having a fully de-identified CDW allows users to do research against the full row-level data set.

20 When will it be available? Q It will eventually replace the Research Data Browser (RDB). Audit of the data to certify de-identification is happening now. UCSF policies for internal use and external sharing are being developed.

21 Improvements in our Tools to use Data

22 MATLAB Site-Wide License: Success! MATLAB is a programming toolkit optimized for complex math, including image processing and vision systems, signal processing, statistics, text analysis, and machine learning. UCSF now has unlimited user licenses (single user, classroom, server). Additional toolboxes available. Full support from MathWorks. Thanks to the existing UCSF MATLAB community, Library, and IT.

23 Self-Service Analytics (Tableau). UCSF site-wide license. Dashboards and visualizations. Environment stood up. Testing underway. Training materials. Initial availability at the end of the year. Learn more at data.ucsf.edu.

24 MyResearch: What's New? UCSF secure hosting environment for sensitive data with web-based management, collaboration tools, and research software (SAS, Stata, R, MATLAB, and more). Significant performance improvements: hardware replacement completed, with more CPU/RAM and storage; Microsoft Remote Desktop Services implemented to deliver applications faster and more simply. Introduced applications: Stata/MP (multiprocessor). All other applications constantly updated/patched. Grown to Studies. LEARN MORE! Afternoon Breakout Track 1, Session

25 REDCap: What's New in the Last 12 Months? REDCap (Research Electronic Data Capture) is a free, secure web application for building and managing online surveys and databases. Three major releases: Repeating Forms and Instruments; External Modules, which extend REDCap functionality and customize base functionality; Smart Variables, dynamic variables that can be used in calculated fields, conditional/branching logic, and piping. Doubled the size of the ARS REDCap team for support! LEARN MORE! Afternoon Breakout Track 1, Session