Modernizing Data Integration


Modernizing Data Integration
To Accommodate New Big Data and New Business Requirements
Philip Russom, Research Director for Data Management, TDWI
December 16, 2015

Sponsor

Speakers
- Philip Russom, TDWI Research Director, Data Management
- Ron Agresta, Principal Product Manager, Data Management, SAS

New Checklist Report from TDWI on Data Integration Modernization
- The report discusses common modernizations users are applying to data integration programs today.
- In this webinar, we'll discuss some of the report's findings.
- Stay tuned to learn how to get a free copy of the report.

Agenda
- Background
  - Defining Data Integration (DI) Modernization
  - Technology and Business Drivers
- High-Priority DI Modernization Tasks
  1. Multiple data ingestion techniques
  2. Agile data prep
  3. Self-service data access
  4. New data platform types
  5. Right-time data movement
  6. Non-traditional data
  7. Integrated tool platforms
- Concluding Summary
PLEASE TWEET @prussom, #TDWI, @SASDataMGMT, #BigData, #Analytics, #DataIntegration

DEFINING Data Integration Modernization
- Upgrades
  - Newer versions of current data integration software and other middleware
  - Bigger and faster hardware
- Additions to existing data integration solutions
  - New data sources, transforms, targets, etc.
  - More server instances, nodes, storage
- Use more functions in your existing DI tools
  - Move from exclusively batch to more diverse interfaces and processing
  - Turn on real-time functions for federation, virtualization, replication
  - Turn on event processing to embrace streaming data
  - Turn on text analytics to embrace unstructured data
- Acquire new specialized tools to complement the old ones
  - Wide range of natural language processing (NLP) tools
  - Native tools for Hadoop or other new environments
- Rip and Replace
  - A few users may modernize by migrating to a different toolset

Big Data Drivers for Data Integration Modernization
- New big data sources
- New business analytics
- New data integration techniques
- Old and new are coexisting

NUMBER ONE
Complement the high latency of older DI practices with a broader range of data ingestion techniques.
- Data ingestion is
  - How, where, and how frequently data entering an environment is landed or loaded into targets
- Some new sources of data generate data frequently.
- Business practices requiring fresh data continue to grow.
- Data ingestion practices need many speeds and frequencies.
- Repurposing data is more and more being done on the fly, at run time, instead of prior to load time.
- Many functions support varied data ingestion
  - Event stream processing, data federation, self-service data access, data prep, micro batch, etc.
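As one illustration of the "many speeds and frequencies" point, micro batch can be sketched as grouping an incoming feed into small fixed-size loads instead of one large nightly load. This is a minimal sketch, not something taken from the report: the names `micro_batches`, `load_batch`, `feed`, and `target` are hypothetical, and the in-memory list stands in for a real target system.

```python
from itertools import islice
from typing import Iterable, Iterator, List


def micro_batches(records: Iterable[dict], batch_size: int) -> Iterator[List[dict]]:
    """Yield consecutive lists of at most batch_size records."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(records)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return  # feed is exhausted
        yield batch


def load_batch(target: list, batch: List[dict]) -> None:
    """Hypothetical loader: append one batch to an in-memory target."""
    target.extend(batch)


# Hypothetical feed of sensor readings arriving as a stream.
feed = ({"sensor": "s1", "reading": i} for i in range(7))
target: list = []
batch_sizes = []
for batch in micro_batches(feed, batch_size=3):
    load_batch(target, batch)
    batch_sizes.append(len(batch))
```

A smaller `batch_size` moves the load closer to real time at the cost of more load operations. A real implementation would more likely trigger on a time interval as well as a record count.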

NUMBER TWO
Embrace the new practices and tools of data prep, for agility, speed, simplicity, and ease of use.
- Data prep (short for "data preparation") is
  - "DI Light," as a subset of DI functionality, trimmed down for usability and performance
  - Synonyms: data wrangling, munging, blending
- Data prep functions are built into many tool types:
  - Data integration, quality, profiling
  - Data exploration, visualization, analytics
- Data prep complements traditional data mgt
  - Data prep originated for data exploration and discovery-oriented analytics
  - Permanent designs or highly accurate reports still require in-depth traditional data preparation
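To make the "DI Light" idea concrete, the sketch below shows the kind of lightweight cleanup a data prep step typically performs before exploration: trimming whitespace, normalizing case, and dropping blank or duplicate keys. It is only an illustration; the function `prep_customers`, the field names, and the sample rows are all hypothetical.

```python
def prep_customers(rows):
    """Trim whitespace, normalize case, and drop blank or duplicate emails."""
    seen = set()
    cleaned = []
    for row in rows:
        # Collapse runs of whitespace, then title-case the name.
        name = " ".join(row.get("name", "").split()).title()
        email = row.get("email", "").strip().lower()
        if not email or email in seen:
            continue  # skip rows with no email and repeated emails
        seen.add(email)
        cleaned.append({"name": name, "email": email})
    return cleaned


# Hypothetical raw extract with inconsistent formatting.
raw = [
    {"name": "  ada   lovelace ", "email": "Ada@Example.com "},
    {"name": "ADA LOVELACE", "email": "ada@example.com"},
    {"name": "grace hopper", "email": ""},
    {"name": "alan  turing", "email": "alan@example.com"},
]
prepared = prep_customers(raw)
```

This is enough for quick exploration. As the slide notes, permanent designs and highly accurate reports would still call for fuller, traditional data preparation.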

NUMBER THREE
Integrate data in ways that enable self-service access to new big data for a wide range of users.
- Self-service data access functions are important
  - They give data workers spontaneity, speed, agility, autonomy
- TDWI Survey identified top self-service tasks users want
  - Data discovery, viz, dashboard authoring, data prep
- Modern DI integrates data specifically for self-service access
  - Data warehouses and marts are still relevant
  - But new big data may require new database types: data lakes, vaults, enterprise data hubs, maybe on Hadoop
- Depend on special tool functions or characteristics for self-service data access
  - Ease of use, biz-friendly data views, data prep

NUMBER FOUR
Modernize your data integration infrastructure by leveraging new data platform types like Hadoop.
- Hadoop is an effective data landing area for many feed speeds and data types.
- Hadoop is a scalable data staging area.
- Hadoop is also suited to data archiving.
- Hadoop scales with push-down processing.
- Hadoop can offload your DI platform or hub.
- Other relatively new platforms
  - Those based on columns, appliances, NoSQL, open source, etc.

NUMBER FIVE
Keep adding more right-time functions as you modernize your data integration solutions.
- New practices discussed earlier demand right-time DI:
  - Data ingestion assumes multiple DI speeds and frequencies
  - Data prep tends toward near-time federation & micro batch
  - Data exploration assumes immediate response for the user
- Many right-time DI functions are available today:
  - High performance (for fast extracts & loads), micro batch (running frequently during the day), data federation (for time-sensitive metrics)
  - Many can be configured to run at multiple right-time speeds; data replication & changed data capture
  - Millisecond real time; streaming, event processing
- DI Modernization often involves using more of the above
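Changed data capture can be illustrated with a simple snapshot comparison: only the rows that differ between two extracts are passed downstream, which is what allows loads to run more often. Note the assumption here: production CDC tools usually read database logs or triggers, whereas this sketch swaps in a plain snapshot diff. The function `capture_changes` and the sample data are hypothetical.

```python
def capture_changes(previous, current):
    """Compare two snapshots keyed by primary key.

    Returns (inserts, updates, deletes): new rows, rows whose values
    changed, and keys that no longer exist.
    """
    inserts = {k: v for k, v in current.items() if k not in previous}
    updates = {
        k: v for k, v in current.items()
        if k in previous and previous[k] != v
    }
    deletes = [k for k in previous if k not in current]
    return inserts, updates, deletes


# Hypothetical snapshots of a small table, keyed by primary key.
yesterday = {1: {"status": "open"}, 2: {"status": "open"}, 3: {"status": "closed"}}
today = {1: {"status": "open"}, 2: {"status": "closed"}, 4: {"status": "open"}}
inserts, updates, deletes = capture_changes(yesterday, today)
```

Row 1 is unchanged, so it is never moved. Moving less data per run is what makes frequent, right-time runs affordable.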

NUMBER SIX
Modernize your data integration functionality, for business value from non-traditional data.
- Non-traditional data is
  - Anything that's not relational or other structured data
  - Unstructured: from human language text to video
  - Semi-structured: hierarchies in JSON or XML
  - Multi-structured: a mix of the above
- For biz value from non-traditional data, modernize 5 layers of DI:
  - Capture
  - Storage
  - Processing
  - Structure
  - Metadata
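For the semi-structured case, the "Structure" layer often comes down to turning a JSON hierarchy into flat, column-like fields that relational tools can consume. The following is a minimal sketch under that assumption. The `flatten` helper, the dotted column naming, and the sample document are hypothetical and not taken from the report.

```python
import json


def flatten(node, prefix=""):
    """Flatten nested dicts and lists into one dict of dotted column names."""
    flat = {}
    if isinstance(node, dict):
        for key, value in node.items():
            flat.update(flatten(value, f"{prefix}{key}."))
    elif isinstance(node, list):
        for index, value in enumerate(node):
            flat.update(flatten(value, f"{prefix}{index}."))
    else:
        # Leaf value: drop the trailing dot from the accumulated path.
        flat[prefix[:-1]] = node
    return flat


# Hypothetical semi-structured order document.
document = json.loads(
    '{"order": {"id": 17, "items": [{"sku": "A1", "qty": 2},'
    ' {"sku": "B7", "qty": 1}]}, "channel": "web"}'
)
row = flatten(document)
```

Flattening into one wide row is only one design choice. Repeating groups such as `items` are often better split into a separate child table keyed by the order id.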

NUMBER SEVEN
Consider modernizing your DI tool portfolio with an integrated platform of multiple data mgt tools.
- Defining the DI integrated platform
  - DI and/or DQ tool at its heart, plus tools for MDM, metadata mgt, stewardship, governance, CDC, replication, event processing, data services, data profiling, data monitoring, etc.
  - Not just a suite. All tools are integrated by sharing metadata, biz rules, master data, development artifacts, collaborative functions
- Strongest trend in data integration tools, by both users & vendors
  - Away from separate best-of-breed tools toward a unified toolset
- Practical reasons for using a unified DI platform
  - Greater collaboration among multiple DI/DM developers and others
  - Single DM solutions that combine multiple DM capabilities
  - Most of the traditional and big data functionality mentioned today in one integrated platform

CONCLUDING SUMMARY
Data Integration Modernization
  1. Multiple data ingestion techniques
  2. Agile data prep
  3. Self-service data access
  4. New data platform types
  5. Right-time data movement
  6. Non-traditional data
  7. Integrated tool platforms

Download a free copy of the TDWI Checklist Report about Data Integration Modernization
Download the report as a PDF file at: bit.ly/dataintmod

DATA INTEGRATION MODERNIZATION WITH SAS
Copyright 2013, SAS Institute Inc. All rights reserved.

PLATFORM: SAS DATA MANAGEMENT
Ingestion | Data Prep | Self-Service | Hadoop | Right-Time DI | New Data | Integrated Platform
- SAS delivers a complete, integrated platform for data access, quality, integration, management, transformation, monitoring, mastering, and governance across a wide range of use cases.

IN-HADOOP: SAS DATA LOADER FOR HADOOP
- Integrated environment for self-service data preparation with data profiling, data quality, data transformation, and code execution actions processed directly in Hadoop
- Business-user-oriented web application with a guided workflow experience
- Automatic optimization that uses the most appropriate run-time execution available

IN-STREAM: SAS EVENT STREAM PROCESSING
- Enables processing on huge volumes of streaming data flowing at very high rates with very low latency
- Delivers in-stream advanced analytics, decisions, and data quality transformations
- Supports varied use cases such as clickstream analysis, IoT sensor analysis, decision management, fraud detection, and risk monitoring

IN-FEDERATED VIEW: SAS FEDERATION SERVER
- Federated view building application that creates dynamic views of heterogeneous data and makes them available to other systems through ODBC, JDBC, or web services
- Supports data masking, caching, and in-view data quality transformations
- Offers table-, row-, and column-level data access controls

IN-DATABASE: SAS IN-DATABASE TECHNOLOGIES
- Data transformation, data quality processing, and analytics performed directly in database or in Hadoop
- Data Quality Accelerators move the power of SAS data quality algorithms to the data, taking advantage of database parallel computing capabilities
- Embedded processing can be invoked from a number of different execution environments

SAS: WANT TO LEARN MORE?
- Learn more about SAS Data Management: http://sas.com/data
- Join the SAS Data Management Community: https://communities.sas.com/
- Follow us on Twitter: @sasdatamgmt
- Like us on Facebook: SAS Software

Questions?

Contact Information
If you have further questions or comments:
- Philip Russom, TDWI: prussom@tdwi.org
- Ron Agresta, SAS: ron.agresta@sas.com