Copyright 2013 Oracle and/or its affiliates. All rights reserved.

2 The Value of Big Data and Analytics in Government. Jan 22, 2014. Wayne Babby, Deputy Director (A), California Department of Corrections & Rehabilitation; Tim Dexter, Solution Architect, Analytics Practice, Oracle

3 SAFE HARBOR STATEMENT: The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle's products remains at the sole discretion of Oracle.

4 Program Agenda: Big Data Success in Government Agencies; CDCR's Road to Big Data; CDCR Demonstration; Questions and Answers

5 Government Big Data Success Stories

6 Big Data in Public Sector: Fraud Prevention; Revenue Management; Constituent Sentiment; Threat Identification; Economic Analysis; Healthcare; Regulatory Compliance, Licensing & Law Enforcement; Open Government; Maintenance & Utilities

7 Big Data in Healthcare: Find relationships between genes and cancer. Use Case: Cross-referenced the relationships between genes and five major cancer types across 20 million medical publication abstracts. Simulated 900 million patient's genes and miRNA. Understanding the operations of these genes and the drugs that target them is expected to lead to better treatment. 2012 Government Big Data Solution

8 Big Data for US Army Readiness: Data-Driven Dashboard. Seamlessly share force structure information across the Army enterprise. Inform leaders on the status of their troops and resources, helping them make better decisions about who is to be deployed and when. Access additional metrics and data for a clearer and even more comprehensive picture of their forces. Enable users to search and analyze classified and unclassified, structured and unstructured information.

9 Big Data for Auditing Health Insurance Exchange. Big Data for Local Health Exchange: Content Optimization and User Sentiment. Scope and Setup: Initially scoped for 1.5 million users; 13 days from need to Solution Definition; 28 days from PO to Production Solution Delivery. Additional Use Cases: Fraud Detection/Prevention; User profiling for Intrusion Detection and enhanced security; System performance and predictive modeling; What-if testing for exchange optimization payers/providers

10 Big Data for Cyber Security: Nowhere for Attackers to Hide. Capture & analyze every packet in and out of our networks. Real-time results. 100 TB of data in 16 hours. 20 Gb per sec analytic rates in 1 rack. Complementary to existing cyber security solutions.
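
To put the two throughput figures on this slide on a common scale, the sketch below converts "100 TB in 16 hours" into a sustained rate in gigabits per second. It assumes decimal units (1 TB = 10^12 bytes, 1 Gb = 10^9 bits); the slide does not say whether both figures describe the same measurement, so treat this as a unit conversion only, not a benchmark comparison. The function name is illustrative.

```python
# Assumption: decimal units (1 TB = 10**12 bytes, 1 Gb = 10**9 bits).
BYTES_PER_TB = 10**12
BITS_PER_GB = 10**9  # bits in one gigabit


def sustained_gbps(terabytes: float, hours: float) -> float:
    """Average rate in gigabits per second for a volume processed over a time span."""
    total_bits = terabytes * BYTES_PER_TB * 8
    total_seconds = hours * 3600
    return total_bits / total_seconds / BITS_PER_GB


rate = sustained_gbps(100, 16)
print(f"{rate:.1f} Gb/s sustained")
```

Under these assumptions the sustained figure comes out below the 20 Gb per sec rate quoted on the slide, which would be consistent with the larger number being a peak analytic rate.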

11 CDCR's Road to Big Data

12 CDCR Organizational Factors: Assembly Bill 109, the California Public Safety Realignment Act, became law on October 1. The law changed where people convicted of non-violent, non-serious and non-sex offenses serve their sentences. A 3-Judge Panel is overseeing reduction of the CDCR Institution Population. Both have been key drivers in leveraging and managing data to make operational, program and policy decisions.

13 CDCR Challenges: Disparate Data Systems; Localized Data; Large Volumes of Data; Lack of Data Governance

14 CDCR Opportunities. New Enterprise Applications: Strategic Offender Management System (SOMS); Business Information System (BIS). Database Consolidation.

15 Steps Used to Identify Use Cases: Met with Programs/Divisions to identify data needs. Identified the sources of the data needed for each Business Case. Identified Business Problems solved through making data available. Identified whether each Business Case required Information Discovery or Business Intelligence: did we know what the question was? Prioritized Business Cases based on Organizational Value and Data Availability.
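
The last two steps above can be sketched in a few lines. This is an illustration only, not CDCR's actual method: the case names, the 1-to-5 scoring scale, and the product-of-scores ranking are all assumptions. The classification rule reflects one reading of the slide, in which a known question points to Business Intelligence and an open-ended one to Information Discovery.

```python
from dataclasses import dataclass


@dataclass
class BusinessCase:
    name: str
    question_known: bool     # did we know what the question was?
    org_value: int           # hypothetical scale: 1 (low) to 5 (high)
    data_availability: int   # hypothetical scale: 1 (low) to 5 (high)


def approach(case: BusinessCase) -> str:
    """Assumed rule: a known question suits BI; an open one suits discovery."""
    return "Business Intelligence" if case.question_known else "Information Discovery"


def prioritize(cases):
    """Rank by organizational value times data availability (assumed scoring)."""
    return sorted(cases, key=lambda c: c.org_value * c.data_availability, reverse=True)


# Synthetic example cases, not taken from CDCR data.
cases = [
    BusinessCase("Bed utilization", True, 4, 5),
    BusinessCase("Program effectiveness", False, 5, 2),
    BusinessCase("Retirement forecast", True, 3, 3),
]

ranked = prioritize(cases)
for case in ranked:
    print(case.name, "->", approach(case))
```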

16 Steps to Identify Use Cases (cont'd)

17 Why Information Discovery? How effective are the rehabilitation programs? Where should they be offered? How effectively are institution beds being used? Where can I find the offender to collect overdue restitution? Where can I find the victim to pay the restitution? What is the current employee count? How many will retire in the next 6 months? DRP/DAPO; DAI; Victim Services; HR

18 CDCR Business Needs and Benefits: Increase time spent on Data Analysis vs. Data Gathering; Analyze Data from Multiple Systems without Modifying Source Systems; Help Address Data Governance and Data Quality Issues

19 Key Steps at CDCR: Data Sources; Organizational Culture; Roles; Executive leadership; Major Stakeholders; Data Maturity; Infrastructure; Current State; Future State; Data Analytics, ETL Tools; Roadmap; Enterprise plan for multiple data sources; Frequency of updates; Historical data; Data Governance

20 Start Simple and Evolve: Phased Plan to Match Offender Needs to Services. Offender Attributes; Offender Needs Identification; Offender Location; Services Available; Services Location; Match Offender Needs to Services; Top Offender Needs; Geographic Concentration of Needs; Statistics on Needs being met; Statewide View of Services; Future: Which Service Where
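
A minimal sketch of the matching idea on this slide, assuming each offender record carries a location and a set of needs, and each location has a set of available services. The record layout, county labels, and need names are hypothetical and the data is synthetic; the slide does not describe how CDCR actually implemented the match.

```python
from collections import Counter

# Synthetic, hypothetical records for illustration only.
offenders = [
    {"id": 1, "county": "A", "needs": {"substance_abuse", "employment"}},
    {"id": 2, "county": "A", "needs": {"employment"}},
    {"id": 3, "county": "B", "needs": {"housing", "employment"}},
]
services = {"A": {"employment"}, "B": {"housing"}}


def top_needs(offenders):
    """Count how many offenders have each need, most common first."""
    return Counter(n for o in offenders for n in sorted(o["needs"])).most_common()


def unmet_needs_by_county(offenders, services):
    """For each county, count needs with no matching service in that county."""
    unmet = {}
    for o in offenders:
        available = services.get(o["county"], set())
        for need in o["needs"] - available:
            unmet.setdefault(o["county"], Counter())[need] += 1
    return unmet


def share_of_needs_met(offenders, services):
    """Fraction of all (offender, need) pairs that have a local service."""
    total = sum(len(o["needs"]) for o in offenders)
    met = sum(len(o["needs"] & services.get(o["county"], set())) for o in offenders)
    return met / total if total else 0.0


print(top_needs(offenders))
print(unmet_needs_by_county(offenders, services))
print(share_of_needs_met(offenders, services))
```

The three functions map loosely onto "Top Offender Needs", "Geographic Concentration of Needs", and "Statistics on Needs being met".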

21 Next Steps for CDCR: The Path to Data Maturity. Establish Data Governance processes; Mature Business Intelligence and make more available to appropriate users; Establish Centralized Data Management function

22 Demonstration: Abdul Shaik, BI/Reporting Manager, California Department of Corrections & Rehabilitation

24 BACKUP

25 Adding Big Data to Your Existing Data Systems

26 Today's Data Architectures. Diagram labels: User Tools; OLTP Systems; Data Warehouse; BI Tools and Dashboards; Custom/Advanced Analytics

27 Acquiring and Using More & Varied Data. Diagram labels: Data Sources; Big Data Ecosystem; User Tools; High Volume Distributed File System; NoSQL DB; Data Warehouse; BI Tools and Dashboards; Custom/Advanced Analytics

28 Why Different Data Stores? Scales well to lots of data, very flexible. Scales well to lots of users, very secure. Diagram labels: Data Sources; Big Data Ecosystem; User Tools; High Volume Distributed File System; NoSQL DB; Data Warehouse; BI Tools and Dashboards; Custom/Advanced Analytics

29 Combining Data Stores. Map Reduce code moves valuable data into the secure store. Diagram labels: Data Sources; Big Data Ecosystem; User Tools; High Volume Distributed File System; NoSQL DB; Data Warehouse; BI Tools and Dashboards; Custom/Advanced Analytics

30 Discovering Valuable Data Anywhere. Diagram labels: Data Sources; Knowledge Discovery Engine; Big Data Ecosystem; User Tools; Information Discovery; High Volume Distributed File System; NoSQL DB; Data Warehouse; BI Tools and Dashboards; Custom/Advanced Analytics

31 Deploy Powerful Reporting and Analytics. Diagram labels: Data Sources; Knowledge Discovery Engine; Big Data Ecosystem; User Tools; Information Discovery; High Volume Distributed File System; NoSQL DB; Data Warehouse; BI Tools and Dashboards; Custom/Advanced Analytics

32 Unified Information Architecture. Diagram labels: Data Sources; Knowledge Discovery Engine; Big Data Ecosystem; User Tools; Information Discovery; High Volume Distributed File System; NoSQL DB; Data Warehouse; BI Tools and Dashboards; Custom/Advanced Analytics; Real-Time Recommendations; Machine Learning Algorithms

34 Big Data History: Hadoop (Map/Reduce). Map Reduce Theory: Paper published 2004 by Google; ingest and search large data sets. Hadoop: Doug Cutting, Cloudera (Yahoo); Lucene (1999), indexing large files; Nutch (2004), search massive data; Hadoop (2007). Basic Composition: Mapper; Reducer; Hadoop Distributed File System (HDFS)
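
The Mapper and Reducer roles named on this slide can be illustrated with a small in-memory word count. This is a sketch of the programming model only, written in plain Python; it does not use Hadoop's API, and the function names (`mapper`, `shuffle`, `reducer`, `run_job`) are illustrative. In a real cluster the shuffle step is done by the framework and the input would come from HDFS rather than a list.

```python
from collections import defaultdict


def mapper(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group all emitted values by key, as the framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reducer(key, values):
    """Reduce step: collapse all values for one key into a single result."""
    return key, sum(values)


def run_job(lines):
    """Run map, shuffle, and reduce over an in-memory list of lines."""
    pairs = (pair for line in lines for pair in mapper(line))
    return dict(reducer(key, values) for key, values in shuffle(pairs).items())


counts = run_job(["big data", "big decisions"])
print(counts)
```

Because each mapper call sees only one line and each reducer call sees only one key, both phases can be spread across many machines, which is the property that lets the model scale to large data sets.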

35 Big Data Tools: Hadoop (Map/Reduce)

36 Structured vs. Unstructured Data. Big Data: Decisions based on all your data. Semantic Graphs; Geospatial; Social Data; Documents; Video; Machine-Generated/Sensor Data; Imagery

37 All Data for All Users: Different users need different tools. Advanced Hadoop Engineers; Government Case Workers; Agency Analysts; Third-party Researchers. How do we get the right data to the right person at the right time?

38 Big Data Connectors: Analyzing Data With Familiar Tools (R, SQL). Expand the data pool for analytics leveraging Hadoop. Use the full power of Oracle SQL on all data. Dynamically leverage Hadoop to execute R analytics. Move valuable data from the cluster to the RDBMS at extreme speeds. Diagram labels: Hadoop; IB; Oracle Database

39 Data in Action: DECIDE; ANALYZE; ACQUIRE; ORGANIZE. Make Better Decisions Using All Your Data