Cloud Integration and the Big Data Journey - Common Use-Case Patterns

A White Paper, August 2014
Corporate Technologies Business Intelligence Group

OVERVIEW

The advent of cloud and hybrid architectures has enabled clients to rapidly stand up technology stacks that traditionally required specialized expertise and long lead times. Big Data, an umbrella term encompassing ingestion, processing, and analytics around structured and semi-structured data sets, has been revolutionary for the data warehousing and analytics market. These data sets, including data from cloud-based solutions, sensors, and Internet-enabled devices, are often large and difficult to process using standard relational data warehousing methodologies. Big Data solutions take an alternative approach to processing these data sets, leveraging both cloud-based and non-relational technologies to derive analytical value.

One significant customer problem with Big Data involves the rapidly changing technology stacks and the specialized code required to work effectively in the space. Companies are reluctant to invest too deeply in one technology for fear of this rapid change. However, many innovative applications leverage Big Data to improve customer satisfaction, reduce operational risk, and increase sell-through. Customers want the benefit of Big Data, but often do not know how much of an investment is required to begin.

To that end, this white paper reviews three specific customer use-case patterns in detail. These use cases discuss both cloud architectures and Big Data solutions, and show how to remove complexity, reduce operational risk, and improve customer satisfaction.

This technical brief is intended to be a companion to The Big Data Journey webinar. The author would like to extend his thanks to John Haddad of Informatica, who provided some of the architecture slides within this white paper.

WWW.CPTECH.COM 781.273.4100 IT SERVICES

USE CASE 1: REMOVE COMPLEXITY

A pharmaceutical client uses several cloud-based applications for sales force and operations enablement. These applications allow business analysts to rapidly provision new functionality on the fly. However, IT must also possess the agility to rapidly and continuously provision and integrate cloud-based application data while maintaining existing data warehouse integrity and data lineage. We leverage cloud services solutions like Informatica Cloud Services (ICS) to help address this challenge.

At our pharmaceutical client, the sales team manages multiple new products and adds new Salesforce.com (SFDC) columns at the rate of one or two a week. The existing ETL process had to replicate data from SFDC down to the main enterprise data warehouse (EDW). Each new column required a corresponding ETL change and an update to jobs, causing significant IT development churn.

We leveraged ICS's SFDC replication solution to mirror each SFDC table into a staging environment within the data warehouse (DW). The ICS workflow is managed through a web-based interface, which is available to the same business analyst who adds fields to SFDC. When a new column has been added to SFDC, the analyst logs into ICS and configures the new column to be replicated to the DW in less than 5 minutes.

In the diagram above, the green databases represent existing SQL Server databases that were not impacted by the switch in replication architecture. We simply removed the existing ETL code feeding the SQL 2008 Replication Stage target and replaced it with an ICS endpoint.

Once replicated to a DW staging environment, the SFDC tables are wrapped with views to create a dimensional analytical layer. This layer is immediately available to trained business analysts, who use BI and visualization tools to perform data analysis. Insights from these analyses are vetted and implemented by the DW team and then turned into operational reporting in the enterprise BI environment on a weekly basis.
Leveraging ICS and the staging replication architecture has allowed us to significantly accelerate time to market within the DW for simple SFDC column additions. The DW team is freed from routine lights-on management tasks, and business analysts can begin their analysis immediately instead of waiting for new ETL development.
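The view-over-staging pattern from this use case can be sketched in a few lines. This is a minimal illustration only, assuming an in-memory SQLite database in place of the client's SQL Server staging environment; the table `stg_sfdc_account`, the view `dim_account`, and all column names and values are hypothetical and do not come from the client's schema.

```python
import sqlite3

# In-memory database standing in for the DW staging environment (assumption).
conn = sqlite3.connect(":memory:")

# Hypothetical staging table mirroring an SFDC object, plus a view that
# exposes it with dimensional naming.
conn.executescript("""
CREATE TABLE stg_sfdc_account (id TEXT PRIMARY KEY, name TEXT, region TEXT);
INSERT INTO stg_sfdc_account VALUES
    ('001', 'Acme Pharma', 'East'),
    ('002', 'Beta Labs', 'West');
CREATE VIEW dim_account AS
SELECT id AS account_key, name AS account_name, region
FROM stg_sfdc_account;
""")

# A new SFDC column is replicated into staging. The existing view lists its
# columns explicitly, so it keeps working unchanged.
conn.execute("ALTER TABLE stg_sfdc_account ADD COLUMN product_focus TEXT")
conn.execute(
    "UPDATE stg_sfdc_account SET product_focus = 'Oncology' WHERE id = '001'"
)
rows_before = conn.execute(
    "SELECT account_key, account_name, region FROM dim_account "
    "ORDER BY account_key"
).fetchall()

# Exposing the new column to analysts is a view change, not an ETL change.
conn.executescript("""
DROP VIEW dim_account;
CREATE VIEW dim_account AS
SELECT id AS account_key, name AS account_name, region, product_focus
FROM stg_sfdc_account;
""")
rows_after = conn.execute(
    "SELECT account_key, product_focus FROM dim_account ORDER BY account_key"
).fetchall()
```

In this sketch, the analytical layer depends only on the view definition, so a column addition touches the replication configuration and one view rather than a chain of ETL jobs.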

USE CASE 2: REDUCE OPERATIONAL RISK

An oil and gas client was looking to understand performance and maintenance activity around their wells. The oil and gas industry uses a large number of sensors to monitor well activity. These sensors measure pressure, level, and flow rates, and are prevalent within the industry. They come with operational monitoring solutions that allow technicians to spot up-to-the-second deviations and apply corrective action. Maintenance is critical for both production and safety, and the earlier a problem is caught, the cheaper it usually is to fix. Our client was very interested in knowing about maintenance issues as soon as possible and, ideally, in applying preventative maintenance to head off a larger issue.

In order to apply more intelligence towards preventative maintenance, the customer wanted to load sensor data into an existing data warehousing solution. However, when the existing ETL infrastructure was leveraged to stream sensor data directly into the warehouse, we quickly encountered performance issues caused by the sheer volume of data being sourced. Sensors report readings in real time, and with multiple sensors per well, the volume of data easily eclipsed hundreds of gigabytes a day for the client's production wells. This created a serious problem with both performance and the expense of storing the data.

Upon further analysis, we realized that we needed to do two things with the full array of sensor data. First, we wanted to apply algorithms to spot deviations in time series data, specifically deviations that stayed above a certain threshold for a period of time. These deviations may change based on measurements from multiple sensor arrays. In short, we were attempting to apply matrix algebra to the existing series of sensor data. Second, once the deviations were spotted, we wanted to provide time-bounded series of this data to the BI environment for reporting and simple analysis.
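The first requirement, spotting readings that stay above a threshold for a period of time, can be sketched as follows. This is a simplified single-sensor illustration, not the client's algorithm: the source describes matrix algebra across multiple sensor arrays, while this version scans one series. The function name, the `(timestamp, value)` tuple layout, and the sample numbers are all assumptions.

```python
from typing import Iterable, List, Tuple

Reading = Tuple[float, float]  # (timestamp in seconds, sensor value): assumed layout


def sustained_deviations(
    readings: Iterable[Reading], threshold: float, min_duration: float
) -> List[Tuple[float, float]]:
    """Return (start, end) windows where the value stays above `threshold`
    for at least `min_duration` seconds. Readings must be time-ordered."""
    windows: List[Tuple[float, float]] = []
    start = None  # timestamp of the first reading in the current run
    last = None   # timestamp of the latest reading in the current run
    for ts, value in readings:
        if value > threshold:
            if start is None:
                start = ts
            last = ts
        else:
            # The run ended; keep it only if it lasted long enough.
            if start is not None and last - start >= min_duration:
                windows.append((start, last))
            start = None
            last = None
    # Close a run that is still open at the end of the series.
    if start is not None and last - start >= min_duration:
        windows.append((start, last))
    return windows


# Hypothetical pressure readings: one sustained excursion and one brief spike.
sample = [(0, 100), (1, 130), (2, 135), (3, 132), (4, 101), (5, 140), (6, 99)]
flagged = sustained_deviations(sample, threshold=120, min_duration=2)
```

The returned windows are the kind of time-bounded series that could then be handed to the BI environment, while the brief spike at timestamp 5 is filtered out before it reaches the warehouse.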
The combination of these two requirements allowed us to introduce a Big Data approach into our overall solution pattern in order to perform ELT pre-processing of this data, applying matrix algebra to the large volumes of sensor data. The sensor data also resembled JSON data structures, which made it more suitable for a Big Data solution, specifically Hadoop. We leveraged Hadoop to filter the data, apply matrix algebra to look for anomalies, and roughly model the filtered records for data warehouse ingestion by transforming the sensor records from JSON to a relational structure.

You can leverage Informatica's PowerCenter Big Data Edition (BDE) to ease both the processing of JSON records (PowerCenter BDE comes with JSON support) and connectivity to Big Data solutions such as Hadoop and other NoSQL databases. In addition, PowerCenter BDE allows you to run workflows in Hadoop without having to program MapReduce jobs directly or through languages such as Pig and Hive. Although these languages are powerful, they require a specialized skillset that is typically different from the relational and ETL skillsets already present within your DW / IT organization.
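The JSON-to-relational step can be illustrated with a small sketch. The white paper does not show the actual record layout, so the field names (`well_id`, `ts`, `sensors`) and the sample values are hypothetical; the sketch uses plain Python rather than Hadoop or PowerCenter BDE and only shows the shape of the transformation.

```python
import json
from typing import Iterator, Tuple

# One relational row per sensor reading: (well_id, timestamp, sensor, value).
Row = Tuple[str, str, str, float]


def flatten_reading(raw: str) -> Iterator[Row]:
    """Turn one nested JSON sensor record into flat, table-ready rows."""
    record = json.loads(raw)
    for sensor, value in record["sensors"].items():
        yield (record["well_id"], record["ts"], sensor, float(value))


# Hypothetical record shape, assumed for illustration only.
raw_record = (
    '{"well_id": "W-17", "ts": "2014-08-01T00:00:00Z", '
    '"sensors": {"pressure": 2140.5, "flow_rate": 88}}'
)
rows = list(flatten_reading(raw_record))
```

Each resulting tuple maps directly onto a row in a narrow staging table, which is one common way to land semi-structured readings in a relational warehouse.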

PowerCenter BDE allows you to leverage Big Data solutions within your EDW environment while making use of existing skillsets. This solution allowed us to significantly reduce load time and space consumed for the EDW. More importantly, the customer reduced the time needed to find maintenance issues by almost 90%. Most of these savings came from less time spent ingesting and processing the data, and from lower operational expense and load on the data warehouse.

USE CASE 3: IMPROVE CUSTOMER SATISFACTION

An online retail company has been selling to customers over the internet for many years, and has accumulated a large data warehouse of customer activity during that time. The retailer is now interested in linking social media elements and real-time customer website navigation into their selling strategy, due to significant user adoption of mobile shopping. This likely comes as no surprise to many readers of this white paper. During the last five years, mobile shopping has become mainstream, and dominant in some sectors such as books and electronics.

However, the retailer also discovered that mobile customers have a higher rate of shopping cart abandonment than traditional laptop browser customers. For various reasons and distractions, mobile customers are abandoning more shopping carts; converting even a small share of these abandoned carts would result in a significant revenue rise for our retailer.

In order to get more mobile conversions, our retailer wanted to provide a more personalized shopping experience to mobile customers by dynamically modifying content as the customer interacts with the site. The content would present products of higher interest and potentially offer aggressive pricing on selected items for certain shopping cart mixes.
We leveraged a Big Data / DW solution pattern built around a NoSQL database, which was used in two ways: 1) to crunch weblogs in real time, and 2) to analyze a customer's Twitter stream, in order to provide items of interest and potential discounting. All of this was linked to a historical customer score made available by the traditional data warehouse via a web service.

Again, you can leverage Informatica's PowerCenter Big Data Edition (BDE) to ease the processing of weblogs and connectivity to Big Data solutions. In addition, you can leverage the Social Media Connector to connect directly to a customer's Twitter stream and source that data into the NoSQL database for further analysis. PowerCenter BDE allows you to leverage Big Data solutions within your EDW environment while making use of existing skillsets.

The retailer chose to deploy this solution in phases, initially for a select tier of customers, as an A/B test. After a month of trial, the select customers demonstrated a material difference in shopping cart losses, of around 10%, compared to a baseline customer group. Customers saw items that they were more likely to buy, both because the items were aligned with their likes and because of more aggressive pricing intended to close the sale.
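The A/B comparison described above comes down to simple arithmetic on cart abandonment rates. The sketch below shows one way to compute it. The white paper does not publish the underlying counts or say whether the roughly 10% difference is absolute or relative, so the function names and all numbers here are hypothetical and chosen only to make the arithmetic easy to follow.

```python
from typing import Tuple


def abandonment_rate(abandoned: int, carts: int) -> float:
    """Share of shopping carts that were abandoned."""
    if carts <= 0:
        raise ValueError("carts must be positive")
    return abandoned / carts


def compare_groups(baseline_rate: float, pilot_rate: float) -> Tuple[float, float]:
    """Return (absolute, relative) reduction of the pilot group's
    abandonment rate versus the baseline group's rate."""
    if baseline_rate <= 0:
        raise ValueError("baseline_rate must be positive")
    absolute = baseline_rate - pilot_rate
    relative = absolute / baseline_rate
    return absolute, relative


# Hypothetical counts, not taken from the retailer's test.
baseline = abandonment_rate(abandoned=700, carts=1000)  # baseline customer group
pilot = abandonment_rate(abandoned=630, carts=1000)     # select tier in the pilot
absolute_drop, relative_drop = compare_groups(baseline, pilot)
```

With these illustrative counts, the pilot group abandons seven percentage points fewer carts, which is a relative reduction of about 10% against the baseline. Reporting both figures avoids ambiguity when results are summarized as a single percentage.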

ABOUT CORPORATE TECHNOLOGIES

Corporate Technologies provides high-value services to clients. Through the effective application of technologies like Business Intelligence, Data Integration and Management, and Enterprise and Cloud Computing, we help clients implement the right IT solutions to empower business innovation and dynamic scalability. From leveraging business intelligence to rethinking the efficiency of the data center, we are your strategic partner for everything from data management to information delivery.

Today's IT solutions have to be highly integrated to solve the complex business challenges that organizations face. Your business cannot afford to work with multiple consulting organizations specializing in silos of experience. Corporate Technologies' engineering team understands how the implementation of any new technology must support both the business and infrastructure requirements. Our ability to successfully integrate Business Intelligence, Data Management, and Systems Technologies by merging complex system and application structures is a rarity in the industry.

We focus on solving complex business challenges. We create long-term relationships with our clients and partners to deliver recommendations and innovative, high-quality, high-value IT solutions. Please visit our website at www.cptech.com, contact us by email, or call us at 781-273-4100.