Empowering the Data-Driven Organization
Jeroen Dijkxhoorn, SAS
Lars Slagboom, ABN AMRO
In 5 years from now:
Elephants will rule the world
Acting on predictive decisions will be standard
Real-time analytics will be to blame for a crash
Mobile user interfacing will be the standard
Data will be everywhere and nobody will know exactly where
Trends: Big Data, Storage, Hadoop & In-Memory Technology
Technology push: storage costs and CPU speed
Cost of storage, memory, computing:
  In 2000 a GB of disk cost $17; today < $0.07
  In 2000 a GB of RAM cost $1,800; today < $10
  In 2009 a TB of RDBMS cost $70K; today < $20K
[Chart: cost per terabyte, 2009 vs. today, for Hadoop, Microsoft PDW, Oracle, Greenplum, Teradata and Vertica; axis from $0 to $100,000]
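The price points above can be turned into reduction factors. The following is a minimal Python sketch of that arithmetic, using only the figures quoted on the slide. Because the "today" prices are given as upper bounds (`<`), the computed factors are lower bounds. The names `PRICES` and `min_reduction_factor` are illustrative and not part of the original deck.

```python
# Prices quoted on the slide:
# (earlier price in USD, upper bound on today's price in USD).
PRICES = {
    "GB of disk (since 2000)": (17.0, 0.07),
    "GB of RAM (since 2000)": (1800.0, 10.0),
    "TB of RDBMS (since 2009)": (70_000.0, 20_000.0),
}


def min_reduction_factor(earlier: float, today_upper_bound: float) -> float:
    """Return the smallest factor by which the price has fallen.

    The slide only gives an upper bound for today's price, so the real
    factor is at least this large.
    """
    return earlier / today_upper_bound


# Hypothetical helper table: item -> minimum reduction factor.
FACTORS = {item: min_reduction_factor(*pair) for item, pair in PRICES.items()}

if __name__ == "__main__":
    for item, factor in FACTORS.items():
        print(f"{item}: at least {factor:.1f}x cheaper")
```

On these figures, disk and RAM prices fell by a factor of more than one hundred, while the quoted RDBMS price per terabyte fell by a single-digit factor.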
To enable analytics in this changing environment, you need to: Bring the Analytics to the Data and run it in a distributed mode
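As an illustration of bringing the analytics to the data, the sketch below computes a mean in a distributed style: each partition produces a small local summary, and only those summaries are combined centrally. This is a generic Python sketch, not SAS or Hadoop code. The partition contents and the names `PARTITIONS`, `local_summary` and `distributed_mean` are assumptions made for the example, and a thread pool stands in for the cluster nodes.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical data: each inner list stands for the rows stored on one node.
PARTITIONS = [
    [120.0, 80.0, 100.0],
    [50.0, 150.0],
    [200.0],
]


def local_summary(rows):
    """Run next to the data; return only (row count, sum of values)."""
    return len(rows), sum(rows)


def distributed_mean(partitions):
    """Combine the per-partition summaries into one global mean."""
    # The thread pool is a stand-in for nodes working in parallel.
    with ThreadPoolExecutor() as pool:
        summaries = list(pool.map(local_summary, partitions))
    count = sum(n for n, _ in summaries)
    total = sum(t for _, t in summaries)
    return total / count


if __name__ == "__main__":
    print(distributed_mean(PARTITIONS))
```

Only one `(count, sum)` pair per partition leaves the node, so the amount of data moved depends on the number of partitions rather than the number of rows.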
Business pull: two eras, two mindsets
Technology constrained: process-centric; focus on cost control; everything is forbidden unless it is permitted
Technology empowered: discovery-centric; focus on value; everything is permitted unless it is forbidden
To enable analytics in this changing environment, you need to: Provide self-service analytic capabilities and automate the decision making process
Data-Driven with Analytics as the main enabler
From Data to Decision
Lifecycle: Manage Data, Explore Data, Develop Models, Deploy & Monitor
Challenges: growth in demand; growth of data; access to talent; controlling cost
Needs: scale the process; avoid replication; increase productivity; decouple cost & growth
SAS directions to address these needs
1 Scale the Process: speed up the data-to-decision lifecycle
  1. Event Stream Processing
  2. High Performance Analytics
  3. Decision Management
2 Avoid Replication: move SAS processing to the data
  1. In-Database Processing
  2. Scoring Accelerators
  3. Code Accelerators
3 Increase Productivity: provide interactive, self-service interfaces
  1. Data Loader for Hadoop
  2. Visual Analytics, Visual Statistics & In-Memory Statistics
  3. Move to responsive web apps based on HTML5
4 Decouple Cost & Growth: support IT cost efficiency efforts
  1. Span data and processing across a grid or cluster
  2. Virtual apps to deploy in private, public or hybrid cloud
  3. On-premise deployment within 3 hours
Platform Strategy, Automotive Engineering
19 models on a single platform
15 billion annual savings
30% production time
Platform strategy: Basis of the Analytics Factory
50% reduction in costs for BI/analytics
Double the value of BI/analytics projects per year
Business areas on the platform: Partners, Fraud, IT, Purchasing, Risk, Marketing, Controlling, Sales, Logistics, Production
3 steps towards an Analytics Factory
Standardization: coming together by agreeing what capabilities to use
Consolidation: keeping together by centralizing the platform
Industrialization: working together by scaling and speeding up the process
Data and Information at ABN AMRO
Introduction: ABN AMRO Enterprise Data & Information
Standardization Consolidation Industrialization
Standardization
Characteristics: focus on the system landscape; everyone has their own preference; data is decentralized
Success factors: external pressure; company-wide theme; policy
Consolidation
Characteristics: focus shifts to the user; the value of integrated data is recognized; waiting times in your data warehouse development
Success factors: introduce user teams; market your data warehouse and BI environment
Industrialization
Characteristics: focus on usage; data grows faster than systems; more demand than supply; data is a chain
Success factors: include business processes in your change; organize your source systems
Marc Lammers: 50 times 2% is also 100%
Back to the elephant
What is Hadoop being used for?
1. Hadoop as a data platform
2. Hadoop as a core component of the next-generation analytical platform
[Diagram: analytics lifecycle of Manage Data, Explore Data, Develop Models, Deploy & Monitor]
Usage 1: Hadoop as data platform
Initiator: this paradigm is mostly driven by IT
Drivers: increasing costs of data storage; increasing volume of data; latency to deliver information
Benefits: large-scale distributed storage and batch processing
Usage 1: Hadoop as data platform
Ingest/Load Data: SAS/ACCESS, SAS Data Management, SAS Event Stream Processing, SAS Federation Server
Cleanse & Transform Data: SAS Data Loader for Hadoop, SAS Data Quality Accelerator for Hadoop, SAS Code Accelerator for Hadoop
Metadata Documentation: SAS Metadata Server
Load Data to Other Sources / Memory: SAS/ACCESS, SAS Data Management, SAS Federation Server
Usage 2: Hadoop as core of next-generation analytical platform
Initiator: this paradigm is mostly driven by business
Drivers: increasing demand for a variety of different and additional information; the need for a flexible data platform to store, process, and analyze data at any scale
Benefits: the business can start thinking big again when it comes to data
[Diagram: analytics lifecycle of Manage Data, Explore Data, Develop Models, Deploy & Monitor]
Usage 2: Hadoop as core of next-generation analytical platform
Manage Data: SAS/ACCESS, SAS Data Management, SAS Event Stream Processing, SAS Federation Server, SAS Data Loader for Hadoop, SAS Data Quality Accelerator for Hadoop, SAS Code Accelerator for Hadoop
Explore Data: SAS Visual Analytics, SAS In-Memory Statistics for Hadoop
Develop Models: SAS HPA Products, SAS Visual Statistics, SAS In-Memory Statistics for Hadoop
Deploy & Monitor: SAS Decision Manager, SAS Scoring Accelerator for Hadoop
Patterns of using SAS with Hadoop for analytics & reporting
SAS with Hadoop: extract from Hadoop, pushing some SAS pre-processing to Hadoop (Hive, Impala)
SAS in Hadoop: Embedded Process, pushing SAS data processing to Hadoop with MapReduce (Scoring Accelerator, Code Accelerator)
SAS on Hadoop: In-Memory Analytics, using Hadoop for storage persistence and commodity computing (HPA, LASR)
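The difference between extracting data and processing it in place can be sketched by counting how many records leave the cluster under each pattern. This is a toy Python illustration, not SAS or Hadoop code. The rows, the scoring rule and the names `CLUSTER_ROWS`, `score`, `extract_then_score` and `score_in_cluster` are assumptions made for the example.

```python
# Hypothetical rows stored in the cluster: (customer_id, balance).
CLUSTER_ROWS = [(1, 250.0), (2, 40.0), (3, 900.0), (4, 15.0), (5, 610.0)]


def score(balance):
    """Toy scoring rule (an assumption, not a real SAS model)."""
    return balance >= 500.0


def extract_then_score(rows):
    """Extract pattern: move every row to the client, then score there."""
    moved = list(rows)  # all rows cross the network
    flagged = [cid for cid, balance in moved if score(balance)]
    return flagged, len(moved)


def score_in_cluster(rows):
    """In-place pattern: score next to the data, return only flagged ids."""
    flagged = [cid for cid, balance in rows if score(balance)]
    return flagged, len(flagged)  # only the results cross the network


if __name__ == "__main__":
    print(extract_then_score(CLUSTER_ROWS))
    print(score_in_cluster(CLUSTER_ROWS))
```

Both functions flag the same customers; they differ only in the second return value, the number of records that had to be moved.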
SAS for Hadoop directions
Directional themes:
  Continuity of business
  Bring SAS processing to the data
  Leverage Hadoop for new technology offerings
  Breadth and depth of modern analytic methods in Hadoop
Information on breakouts: Analytical platform
13.30 Parallel Sessions
  Big Data and Visual Analytics: Rabobank
  Business Analytics: SAS
  Data Management: Ziekenhuis Gelderse Vallei
  Visual Analytics: Mercachem
13.30 Guided Tours
  Visual Analytics
14.30 What's Hot Sessions
  Big Data Analytics with Hadoop
  Data Management 3.0: What about Hadoop?
  What's hot in Data Governance
  Modernization: more possibilities, fewer risks
  Advanced modeling with SAS
  What's new in SAS Visual Analytics 7.1
  Best practices in visualization and dashboard design
14.30 Roundtables (max. 20 people)
  The Analytical Bank
  Data monetization
15.45 Parallel Sessions
  Big Data and Visual Analytics: Belastingdienst
  Business Analytics: ibridge/ Randstad
  Data Management: DSM
  Visual Analytics: H@nd