Evolution to Revolution: Big Data 2.0

An ENTERPRISE MANAGEMENT ASSOCIATES (EMA) White Paper
Prepared for Actian
March 2014

IT & DATA MANAGEMENT RESEARCH, INDUSTRY ANALYSIS & CONSULTING

Table of Contents

Executive Summary
Big Data is Maturing Fast
Drivers of Change
Evolution to Revolution
Hybrid Data Ecosystems and Big Data 2.0
Orchestration and Integration
EMA Perspective
About Actian

Executive Summary

Big Data is evolving quickly, and industry research indicates that a new level of sophistication is required to keep up. Big Data 2.0 has arrived, and early adopters of Big Data 1.0 strategies are challenged by poorly integrated traditional systems that are inflexible and difficult to manage. The Big Data landscape continues to shift toward more sophisticated workloads that go beyond simple analytics to operational processes that drive deep business value. Diverse data sources and real-time demands are changing traditional architectures to include an array of purpose-built platforms, presenting new opportunities and challenges.

Big Data is Maturing Fast

Innovation is a constant in the area of data management and analytics, dating back to the 1970s, when E. F. Codd introduced relational databases, and continuing all the way to the innovative team at Yahoo who recently brought us Hadoop. It seems that in the blink of an eye technological advancements are driving our Big Data and analytics strategies further and faster than we initially imagined. This evolution is driven by a variety of trends, all of which create a perfect storm of challenges and opportunities for innovative companies.

Drivers of Change

Big Data adoption is spurred on by four major technical trends, and it's causing the industry to evolve at a faster rate than many of us believed possible. These four trends are moving technology forward while opening the door for greater insight and innovation around enterprise data.

Maturing User Communities
Maturing user communities have created a demand for more sophisticated and complex utilization of enterprise data. Highly complex workloads are the norm, and traditional systems and architectures are challenged to meet these evolving needs. The democratization of data-driven insights is empowering a wider user base by including line of business executives in the discussion and value proposition surrounding Big Data. Finance, Marketing and Sales are sponsoring Big Data projects nearly as fast as IT organizations.

New Technologies
Innovative technologies such as MPP environments, columnar databases, flash drives, in-memory computing, Hadoop and NoSQL databases are all contributing to the technology surge that is powering Big Data and its possibilities. Technology is allowing us to execute workloads that were once impractical from a time and resources standpoint.

Economics
The capital cost of working with vast data sets has dropped significantly over the past few years. Many areas of our analytic infrastructure are benefiting from commoditization. Servers, memory and disks are all less expensive than ever, allowing us to do more with less. Many of the new Big Data frameworks are based on open source technology, creating a lower financial barrier to adoption.

Valuable Data Types
New and valuable data types have caught the imagination of companies that see a competitive edge in leveraging machine, sensor, appstream and social data to open new avenues of insight and execution. The Internet of Things is driving innovation and bringing a flood of new data to our businesses. At the same time, Big Data is supplying us with the tools to tap into unstructured enterprise information we were once forced to ignore due to cost or lack of technology. As Big Data resources evolve, companies are addressing the opportunity that these data types can deliver.

Evolution to Revolution

These four trends act as catalysts for early adoption of Big Data projects. EMA's 2012 research report, Big Data Comes of Age [1], illustrated how early projects were being implemented. Early adopters of Big Data ranked access to internal and external multi-structured data sets as their number one technical driver for implementing projects, while 51% of respondents stated that their primary use case for Big Data was online archiving. Both of these data points illustrate how the early stages of Big Data strategies were focused on wrangling information and working to leverage it. 45% of respondents ranked staging structured data as the second most popular use case. Data from EMA research shows that analytic workloads are a primary goal of companies looking to leverage Big Data and execute sophisticated analysis. Complex operational workloads are quickly becoming the norm as Big Data strategies mature.

Early stage projects opened the door for companies to experiment and address entry-level Big Data opportunities. These projects faced challenges from multiple directions. 41% of EMA research respondents indicated that a lack of skills to manage multi-structured data platforms such as Hadoop was a leading deterrent to their overall success. 44% of respondents planned to address the skill gap through internal training of staff, a time-consuming and expensive task. Adding new platforms to an already complex data management landscape makes it difficult to orchestrate data and workloads. Implementing projects across these platforms demands a higher level of integration between solutions than most Big Data version 1.0 ecosystems have. Overcoming a skill gap and adopting new technologies is difficult under the best of circumstances. As early projects gave way to next-level initiatives, new challenges surfaced for companies adopting Big Data.

Significant trends emerge from one year to the next as Big Data 1.0 projects move toward a more sophisticated set of requirements. In the 2013 EMA Big Data research, Operationalizing the Buzz: Big Data 2013 [2], it became clear that a shift is taking place in the Big Data landscape, and several themes have emerged that are driving Big Data to the next level:

- Complex operational workloads are driving greater value in Big Data projects.
- Real-time data demands have overshadowed batch-style data.
- Sophisticated Big Data projects require diverse data sources.
- Companies are utilizing multiple platforms to execute complex workloads.

In short, Big Data has evolved into a mission-critical technology for enterprise companies. Data from 2013 EMA research demonstrates this shift in multiple ways. Across the 600 active Big Data projects surveyed, the most popular workloads are Fraud Analysis/Risk Management, CRM and Asset Optimization. Each of these project types is operational in nature, complex, real-time driven, includes diverse data assets, and reaches beyond a Hadoop-only environment to leverage traditional platforms.

[1] Big Data Comes of Age, EMA and 9Sight Consulting, November 2012. http://www.enterprisemanagement.com/research/asset.php/2409/big-data-comes-of-age
[2] Operationalizing the Buzz: Big Data 2013, EMA and 9Sight Consulting, November 2012. http://www.enterprisemanagement.com/research/asset.php/2641/operationalizing-the-buzz:-big-data-2013

2013 Project Challenge (percentage of projects)

13.1%  Fraud Analysis, Liquidity Risk Assessment (e.g., risk management)
12.6%  Customer Relations Management (e.g., ad-hoc operational queries)
11.7%  Staff Scheduling, Logistical Asset Planning (e.g., asset optimization)
11.2%  Billing, Rating (e.g., operational event and policy processing)
10.6%  Campaign Optimization, Market Basket Analysis, Cross-sell/Up-sell Recommendation
10.1%  Grouping and Relationship Analysis, Geographic Optimization (e.g., clustering, social graph)
 9.9%  Point of Sale, Customer Care (e.g., operational transaction processing)
 7.5%  Sentiment Analysis, Opinion Mining (e.g., natural language processing, text analytics)
 7.2%  Social Brand Management Analysis (e.g., event processing with text analytics)
 6.2%  Path Analysis, Customer Churn (e.g., behavioral analysis)

Figure 1: Big Data projects by type from EMA Operationalizing the Buzz: Big Data 2013 research.

To further make the case for maturity in Big Data, EMA research identified a new focus on speed requirements among the 2013 research respondents. Technical and business drivers behind Big Data projects aligned on this topic. Respondents identified requirements for faster analytical or transactional processing of structured and unstructured data sets (54%), along with the need to react faster to real-time streaming data sources (51%), as the top drivers for Big Data projects. At the same time, respondents selected faster response time for operational and analytical workloads as the primary business driver behind Big Data projects. It's not often that IT/technical drivers and business drivers align this well. The need for greater speed supports the findings that operational workloads are gaining prominence and overall project complexity is growing.

Hybrid Data Ecosystems and Big Data 2.0

As Big Data 1.0 gives way to Big Data 2.0, organizations are faced with new data, new users, new workloads and new complex strategies. At the core of these strategies or best practices for Big Data is a paradigm shift away from a centralized enterprise data warehouse as the central data source for business intelligence and analytics, toward a more diverse landscape of data-driven platforms. This Hybrid Data Ecosystem (HDE) is focused on matching data types and workloads with the best possible platform to meet the needs of the enterprise or a specific project. Every company's ecosystem will be somewhat unique in makeup, but it will share commonality of requirements, management, integration, platforms, workloads and users.

[Figure: Hybrid Data Ecosystem diagram. Labels recoverable from the diagram: users (Line of Business Executives, Business Analysts, BI Analysts, Data Scientists, External Users, Developers, IT Analysts); platforms (Operational Systems, Enterprise Data Warehouse (EDW), Data Mart (DM), Analytical Platform (ADBMS), Discovery Platform, NoSQL, Hadoop, Cloud Data); workloads (Operational Processing, Operational Analytics, Analytics, Exploration); requirements (Structure, Load, Economics, Complex Workload, Response); plus Information Management and Data Integration.]

Hybrid Data Ecosystems add power and agility to a company's analytic landscape. At the same time, they can add complexity and new challenges. When choosing platforms, it is important to investigate how well they will integrate and work with the other solutions your company has invested in. Leading vendors in this space are working to add orchestration and integration between solutions to abstract away the complexity and leverage the power of a Hybrid Data Architecture.

The movement toward Hybrid Data Ecosystems, especially in support of Big Data initiatives, has been underway for several years. EMA research has tracked this paradigm shift via our 2012 and 2013 Big Data research studies. The 2013 findings illustrate that 60% of Big Data projects are utilizing two or three of the eight HDE platforms.

2013 Hybrid Data Ecosystem Platform Distribution

28.2%  One Platform
32.1%  Two Platforms
27.8%  Three Platforms
 4.3%  Four Platforms
 3.5%  Five Platforms
 1.5%  Six Platforms
 2.3%  Eight Platforms

Over 11% of Big Data projects are relying on 4 to 8 individual platforms to execute sophisticated workloads.

Utilizing the best possible platform within a Hybrid Data Ecosystem creates several value propositions not generally available with traditional environments. Platform-specific workloads allow end users to align applications and optimize their performance on the supporting platforms. A new level of agility is delivered as well, providing flexibility in how applications and work processes are delivered. Aligning to the proper platform increases performance and addresses the demands of real-time insights and operational workloads, allowing the system to support the speed of the business.

Each platform in a Hybrid Data Ecosystem delivers unique value and abilities. They include:

Operational systems: Business support systems such as website order entry applications, Point Of Sale (POS), Customer Relationship Management (CRM) or Supply Chain Management (SCM) applications. These platforms contain increasingly fine-grained information on transactions and demographics.

Enterprise data warehouse: Centralized analytical environments where the corporate-level, reconciled and historical information of an organization is stored. These platforms use structured data organizations (schemas) built around history over time rather than only present information.

Data mart: Often distributed analytical environments where a particular subject area or department-level data set is stored for historical or other analysis. These platforms often have a similar data organization to the enterprise data warehouse, but serve smaller user groups.

Analytical platforms: Specifically architected and configured environments for providing rapid response times for analytical queries. These platforms are generally developed to support high-end analysis via tuned data structures such as columnar data storage or indexing.

Discovery platform: Data discovery platforms support both standard SQL and programmatic API interfaces for iterative and exploratory analytics.

NoSQL: NoSQL data stores use non-traditional organizational structures such as key-value, wide-column, graph or document storage structures. These data stores support programming APIs and limited SQL variants for data access.

Hadoop: A specific variant of the NoSQL platform based on the Apache Hadoop open source project and its associated sub-projects. These platforms are based on Hadoop's Distributed File System (HDFS) storage and the evolving MapReduce (MRv2 or YARN) processing framework.

Cloud: Cloud data sources and computing platforms make information available via standardized interfaces (APIs) and bulk data transfers. Big Data adoption in the cloud is growing fast, driven by lower capital costs and fast project implementation cycles.

As mentioned above, Big Data 2.0 workloads are complex, generally require an element of speed, incorporate multiple data sources and rely on a variety of platforms to execute the work. 2013 EMA research identified analytic databases as the most used platform in the 600 active projects surveyed. The 2013 Platforms Used in Big Data Ecosystem chart illustrates the diversity required to meet Big Data workloads. It is interesting to see that Analytical Platforms are at the top of the list at 42% utilization, while Hadoop is utilized in only 16% of the projects.
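As a quick arithmetic check, the platform-count shares from the 2013 Hybrid Data Ecosystem Platform Distribution chart can be added up to confirm the two summary figures quoted in the text (roughly 60% of projects on two or three platforms, and over 11% on four to eight). The sketch below is illustrative only: the variable and function names are invented here, and because the chart lists no share for seven platforms, none is assumed.

```python
# Shares of Big Data projects by number of HDE platforms used, as listed in the
# 2013 distribution chart. No seven-platform share is listed, so none is included.
platform_share = {1: 28.2, 2: 32.1, 3: 27.8, 4: 4.3, 5: 3.5, 6: 1.5, 8: 2.3}


def combined_share(shares, counts):
    """Add up the percentage shares for the given platform counts."""
    return round(sum(shares[n] for n in counts if n in shares), 1)


two_or_three = combined_share(platform_share, [2, 3])        # the "60%" claim
four_to_eight = combined_share(platform_share, range(4, 9))  # the "over 11%" claim

print(f"Two or three platforms: {two_or_three}%")
print(f"Four to eight platforms: {four_to_eight}%")
```

The sums come to 59.9% and 11.6%, which is consistent with the rounded figures quoted in the text.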
2013 Platforms Used in Big Data Ecosystem (percentage of responses)

42.1%  Analytical database platforms/appliances
39.4%  Operational data stores
39.0%  Cloud-based data solutions
33.6%  Enterprise or federated data warehouse
30.1%  Data marts
21.6%  NoSQL data store platforms
18.1%  Data Discovery platforms
16.2%  Hadoop and its subprojects
 0.4%  Other (Please specify)

Selecting the platforms that are right for your needs can be confusing. The EMA Hybrid Data Ecosystem references five requirements to assist in making this decision.

Structure
It's critical to understand the structure of the data to be utilized and how that data will be organized. Schema flexibility is key to the agility you can get from a Hybrid Data Ecosystem. Exploring the structure of the data will assist you in determining the best platform.

Load
Most complex Big Data workloads leverage diverse data sources. The mix of data, along with its velocity, will determine the best platform. Batch versus real-time is a critical decision point when exploring the best platform alternatives.

Economics
Big Data is enabled by economic factors. Many of the more innovative data-driven processes companies are researching would have been economically prohibitive in the past. Selecting cost-effective platforms is very important when researching solutions for a hybrid environment. Unified platforms that combine multiple capabilities within a single solution can positively impact the economic side of these decisions.

Analytics
Complexity of workload is one of the most important requirements of a platform in a Hybrid Data Ecosystem. Operational processing, operational analytics, advanced data exploration and standard analytic needs must be taken into consideration when choosing the best platforms.

Response
Operating at the speed of business is critical to any application or operational process. Choosing a platform that matches the necessary speed to insight is non-negotiable when creating a responsive and agile Hybrid Data Ecosystem.

Orchestration and Integration

Applying the requirements of a Hybrid Data Ecosystem to select the proper platforms to fit your needs is important, but at the same time building an ecosystem that is easily managed can be extremely difficult. The vendor community has recognized this gap and has started to deliver unified platforms that incorporate multiple platforms under a single solution stack. These unified offerings are highly integrated and can be more easily managed than systems that are cobbled together. These systems are adept at orchestrating Big Data workloads, operational processing, operational analytics and standard analytic workloads, and many enable advanced data exploration features.
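One way to make the five requirements (Structure, Load, Economics, Analytics, Response) concrete is to treat platform selection as a weighted comparison. The following is a minimal sketch under stated assumptions, not an EMA method: only the requirement names come from the paper, while the `Platform` class, the `rank_platforms` function, the 0 to 5 fit scores and the weights are all hypothetical values invented for illustration.

```python
from dataclasses import dataclass

# Requirement names come from the paper; everything numeric below is a
# hypothetical illustration, not EMA research data.
REQUIREMENTS = ("structure", "load", "economics", "analytics", "response")


@dataclass(frozen=True)
class Platform:
    name: str
    fit: dict  # requirement name -> assumed fit score from 0 (poor) to 5 (strong)


def rank_platforms(platforms, weights):
    """Return (name, weighted score) pairs sorted from best to worst fit."""
    def score(platform):
        return sum(weights.get(r, 0) * platform.fit.get(r, 0) for r in REQUIREMENTS)

    scored = [(p.name, score(p)) for p in platforms]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical fit scores for three of the eight HDE platform types.
platforms = [
    Platform("Analytical platform",
             {"structure": 2, "load": 3, "economics": 3, "analytics": 5, "response": 5}),
    Platform("Hadoop",
             {"structure": 5, "load": 4, "economics": 5, "analytics": 3, "response": 2}),
    Platform("Enterprise data warehouse",
             {"structure": 1, "load": 2, "economics": 2, "analytics": 4, "response": 3}),
]

# Hypothetical project profile: a real-time operational workload, so response
# time and analytic complexity are weighted most heavily.
weights = {"structure": 1, "load": 1, "economics": 1, "analytics": 2, "response": 3}

ranking = rank_platforms(platforms, weights)
for name, total in ranking:
    print(f"{name}: {total}")
```

With these made-up numbers the analytical platform ranks first because the project profile weights response time most heavily. Changing the weights to favor schema flexibility and cost would favor a different platform, which is the point of matching each workload to its best-fit platform rather than defaulting to one system.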
EMA Perspective

It's clear that a significant shift is underway in the area of Big Data. Early opportunities to leverage new data types have fostered new levels of innovation, making Big Data a critical component of enterprise strategies. As the technologies evolve, mature companies will need to invest in solutions that are designed to meet these new demands. To meet present and future needs, consider the following when building your strategy around Big Data:

- Look to unified architectures that deliver the platform functionality required while including highly orchestrated data and management features. Systems that support collaboration and reuse will save time and allow you to be more agile.
- Ensure that your vendor partners can deliver enterprise-level service, including domain expertise, to enable greater value from your Big Data investment.
- Investigate your present and future needs for Big Data speed of execution. Both business and IT are struggling to meet this new Big Data 2.0 challenge.

Leading platforms will go beyond these features to include automated workload management and easy embedding of Big Data into applications and workflow processes.

About Actian

The Actian Analytics Platform accelerates the entire analytics value chain, from connecting to massive amounts of raw big data all the way to running sophisticated analytics in real time. The entire platform is built to bring convergence to a Hybrid Data Ecosystem:

- Connect any data or platform for greater precision
- Prepare and enrich all data for increasing value
- Share computing and data at runtime for real-time accuracy
- Choose from hundreds of analytic building blocks
- Rapidly assemble and reuse analytic workflows
- Optimize response to events with lower latency
- Continually increase the precision of automated decisions
- Deliver real-time insight to anyone, anywhere

The current shift to Big Data 2.0 creates an opportunity to release the $15 trillion still trapped in enterprise data. The race is on to provide affordable access to the 88% of enterprise data that has proven impractical to leverage in the past. To move forward to Big Data 2.0, six next-generation capabilities of the Actian Analytics Platform help companies accelerate and stay ahead of the curve in the fast-paced Big Data market:

1. Cooperative processing delivers faster time to value and better price performance
2. Analytic building blocks provide accessibility for non-skilled and less skilled workers
3. Moving processing to where the data lives operationalizes big data and pushes toward real-time
4. Combining non-relational and relational data enables a richer set of analytics
5. Service layers abstract away the complexity of underlying infrastructure
6. A unified platform provides modular approaches for entry points anywhere along the analytic process

About Enterprise Management Associates, Inc.

Founded in 1996, Enterprise Management Associates (EMA) is a leading industry analyst firm that provides deep insight across the full spectrum of IT and data management technologies. EMA analysts leverage a unique combination of practical experience, insight into industry best practices, and in-depth knowledge of current and planned vendor solutions to help its clients achieve their goals. Learn more about EMA research, analysis, and consulting services for enterprise line of business users, IT professionals and IT vendors at www.enterprisemanagement.com or blogs.enterprisemanagement.com. You can also follow EMA on Twitter or Facebook.

This report in whole or in part may not be duplicated, reproduced, stored in a retrieval system or retransmitted without prior written permission of Enterprise Management Associates, Inc. All opinions and estimates herein constitute our judgement as of this date and are subject to change without notice. Product names mentioned herein may be trademarks and/or registered trademarks of their respective companies. EMA and Enterprise Management Associates are trademarks of Enterprise Management Associates, Inc. in the United States and other countries.

©2014 Enterprise Management Associates, Inc. All Rights Reserved. EMA, ENTERPRISE MANAGEMENT ASSOCIATES, and the mobius symbol are registered trademarks or common-law trademarks of Enterprise Management Associates, Inc.

Corporate Headquarters:
1995 North 57th Court, Suite 120
Boulder, CO 80301
Phone: +1 303.543.9500
Fax: +1 303.543.7687
www.enterprisemanagement.com
2859.031014