Lake or Cesspool? The Challenges of Big Data Infrastructure

Big data environments are making it quick and easy for companies to store any and all forms of audience and customer data. Marketers are increasingly seeking to activate this data to power targeted marketing and personalization. The concept of a data lake has emerged as a form of big data repository to support a traditional data warehouse. A data lake stores large volumes of data in structured, semistructured, or unstructured form without the need to define a data schema up front. This not only allows for an easy and nimble way to capture data, but also provides agile and granular access for analytics.

A well-conceived data lake should empower data scientists and marketing analysts to mine insights and identify new attributes for targeted marketing or predictive modeling. This enables organizations to focus on extracting and processing only those data elements that will drive the highest business value. As a result, the right data gets incorporated into the more structured data aggregates and attributes, which, in turn, power marketing execution.

However, the ability to collect data rapidly from a multitude of sources can create an opening for data lakes, or storage systems, to become a quagmire of information that is not easy to understand or to act upon.

Companies that are weighing competitive market advantage and are planning, or actively engaged with, big data infrastructure should ask themselves the following questions.

How good is your big data in terms of quality?

In big data environments, the ingestion of data is often unqualified. This leaves much of the due diligence to users, who may spend 90 percent of their time on data cleansing and other preparation instead of data analysis. Data lakes that give data scientists nimble access for exploration should not, in turn, compromise on data quality for business applications. High-quality data is required for reliable reporting, accurate modeling, campaign execution, and other production marketing capabilities. In data lakes, data may be raw or unstructured, but it should still be as accurate and complete as possible. Businesses need to consider how they are managing their data and what gatekeeping methods need to be applied to keep their data lakes clean and easy to use in a production or end-user environment.

DATA QUALITY EXAMPLE
Overview: Web analytics hit-level data is ingested into the data lake through an automated process. The web analytics team has changed the meanings of events and custom variables without notifying other areas of the business, including the central analytics group that uses web analytics data within the data lake for modeling.
Problem: Web behavioral attributes used in the modeling are no longer accurate.
Solution: Institute a process for communicating changes that affect data structure or meaning downstream to all groups that consume the data.
Solution: Create automated data quality processes that alert data administrators when the distribution of data values changes significantly.

Do you have the necessary taxonomy and cross-reference data to analyze and aggregate data from the data lake?

Querying and analyzing a wide range of data requires the effort to define and classify the data elements within. There is little value in maintaining an environment to house and query big data if many of the supporting descriptive elements needed to analyze, classify, and aggregate that data cannot be accessed within the lake itself. It is important to have a disciplined taxonomy for descriptive data elements that is consistent across the company (e.g., individual and household identifiers, customer segmentation, web page categories, and campaign metadata). This discipline will allow the analyst to prepare, analyze, and aggregate data within the lake more easily, and then integrate the data effectively across systems.

CROSS-REFERENCE EXAMPLE
Overview: Mobile app usage event data is available in the data lake. The data can be analyzed and aggregated by customer ID, but customer (CRM) data, including customer subscription status, customer segment, and customer value, is not available in the data lake. The business wants insight into which customers are the heaviest app users. The business also wants to be able to leverage app usage to target marketing messages.
Problem: Mobile app usage data cannot be joined to customer attributes within the data lake to analyze usage. In addition, the mobile app data is too large to feasibly import into the data warehouse for analysis.
Problem: Customer subscription status and segment data are not available to extract the relevant data needed to target marketing messages.
Solution: Integrate customer attributes into the data lake so they can be used in analysis or data selection.
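As a rough illustration of the cross-reference example, the sketch below shows what becomes possible once customer attributes sit alongside the app usage events. The field names, sample records, and helper functions are hypothetical and are not taken from any particular platform; a real data lake would run the equivalent join in its own query engine.

```python
from collections import Counter

# Hypothetical app usage events as they might land in the data lake.
app_events = [
    {"customer_id": "C1", "event": "open"},
    {"customer_id": "C1", "event": "search"},
    {"customer_id": "C2", "event": "open"},
    {"customer_id": "C3", "event": "open"},
    {"customer_id": "C1", "event": "purchase"},
]

# Hypothetical CRM attributes integrated into the lake, keyed by customer ID.
crm_attributes = {
    "C1": {"subscription_status": "active", "segment": "loyal", "value": "high"},
    "C2": {"subscription_status": "lapsed", "segment": "occasional", "value": "low"},
}


def join_usage_to_crm(events, crm):
    """Count events per customer and attach CRM attributes where they exist."""
    usage = Counter(e["customer_id"] for e in events)
    joined, unmatched = [], []
    # most_common() orders customers from heaviest to lightest usage.
    for customer_id, count in usage.most_common():
        attrs = crm.get(customer_id)
        if attrs is None:
            # Keep track of IDs with no CRM record instead of dropping them silently.
            unmatched.append(customer_id)
            continue
        joined.append({"customer_id": customer_id, "event_count": count, **attrs})
    return joined, unmatched


def select_for_targeting(joined, status, min_events):
    """Pick customers with a given subscription status and a minimum usage level."""
    return [
        row["customer_id"]
        for row in joined
        if row["subscription_status"] == status and row["event_count"] >= min_events
    ]


joined, unmatched = join_usage_to_crm(app_events, crm_attributes)
heavy_active_users = select_for_targeting(joined, status="active", min_events=2)
```

The list of unmatched IDs is worth reporting on its own: a growing share of usage that cannot be tied to a customer record is an early sign that the cross-reference data is falling behind.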

DATA TAXONOMY EXAMPLE
Overview: Display view and click data is kept in the data lake. A campaign identifier is available in the display data to identify the particular display marketing activity. The business wants to analyze the performance of display marketing at the campaign group level. Campaign metadata is not maintained within the data lake.
Problem: Because the display view and click data are isolated from the campaign metadata, campaign performance cannot be aggregated to the campaign group level.
Solution: Integrate campaign metadata into the data lake.

Overview: Web analytics data is maintained in the data lake. Business users want to report on the number of form submissions. There are different ways that form submissions can be aggregated from the event-level data (e.g., counting total events, counting unique events by visit or visitor, or counting events within certain time- or campaign-based attribution rules).
Problem: Because the business analytics group has not established a standard for defining aggregated form submissions, each analyst aggregates the data in a slightly different way, creating inconsistency and a lack of trust from the business.
Solution: Establish data aggregation and reporting standards. Define aggregate metrics across the organization.
Solution: Create aggregated summary tables or views of the data.

How easy is it to integrate your big data with other important (traditional) data environments?

Big data is often tied to specific customer touchpoints and actions (e.g., transactions and web log activity). In order for it to be meaningful and actionable, this data needs to integrate with other customer and campaign data housed in the data warehouse and other marketing or analytics platforms (e.g., campaign and content management, transactional systems, and modeling platforms). It is important to build a big data environment with serious consideration toward how it will integrate across other datasets. For example, common keys and identifiers must be accessible across datasets, and consistent methods must be applied for identification management. This will drive better associations back to customers and ensure a common enterprise customer definition.

ABILITY TO INTEGRATE DATA WITH OTHER DATA ENVIRONMENTS EXAMPLE
Overview: Web activity data, associated with a visitor ID, is maintained in the data lake. Sometimes, when visitors log into their account, a web profile ID is captured in the web activity data. The association between the profile ID and the customer ID is not available within the data lake.
Problem: Web activity data cannot be tied back to customers, so it provides little value to the CRM program.
Solution: Bring the profile ID to customer ID cross-reference file into the data lake. Use an identity management solution to associate visitor IDs with customer IDs and retroactively match data for anonymous visitors who log into their account at a future date.

How does big data fit into your overall data strategy?

All of the above points add up to the fact that big data and data lakes do not operate in isolation. How will marketing and analytics teams leverage the data? How will the data drive performance or customer insight? How will that data eventually be activated to power targeted marketing, predictive modeling, or personalization? A comprehensive data strategy must be developed in order to understand the objectives and requirements of the big data infrastructure before it is built and activated. Technology, marketing, and analytics user groups need to be included in the discussions, which may require education about what big data is and how it could apply to their work.
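The retroactive matching described in the integration example above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a substitute for an identity management solution: the record layout, the IDs, and the function names are all hypothetical.

```python
# Hypothetical web activity records; profile_id is only present after a login.
web_activity = [
    {"visitor_id": "V1", "profile_id": None, "page": "/home"},
    {"visitor_id": "V1", "profile_id": None, "page": "/pricing"},
    {"visitor_id": "V1", "profile_id": "P9", "page": "/account"},  # visitor logs in
    {"visitor_id": "V2", "profile_id": None, "page": "/home"},
]

# Hypothetical profile ID to customer ID cross-reference brought into the lake.
profile_to_customer = {"P9": "C42"}


def build_visitor_map(events, profile_xref):
    """Map each visitor ID to a customer ID using any event that carries a profile ID."""
    visitor_to_customer = {}
    for event in events:
        customer_id = profile_xref.get(event["profile_id"])
        if customer_id is not None:
            # Simplifying assumption: the most recent login wins for a visitor ID.
            visitor_to_customer[event["visitor_id"]] = customer_id
    return visitor_to_customer


def stitch(events, profile_xref):
    """Attach a customer ID to every event, including those recorded before login."""
    visitor_map = build_visitor_map(events, profile_xref)
    return [
        {**event, "customer_id": visitor_map.get(event["visitor_id"])}
        for event in events
    ]


stitched = stitch(web_activity, profile_to_customer)
```

In this sketch the two anonymous page views by V1 inherit the customer ID found at login, while V2 stays unmatched. Shared devices, where one visitor ID maps to several customers, are deliberately ignored here and would need explicit rules in practice.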

HOW BIG DATA FITS INTO DATA STRATEGY EXAMPLE
Overview: Analysts discover a strong correlation between certain online web activities and future purchase activity within the data lake. Marketers want to use these online web activities to trigger marketing messages. IT returns with a plan to aggregate and import the web activity data into the CRM data store; they estimate that the data development needed to incorporate the data into the CRM system will take several weeks.
Problem: An effective data strategy is not in place to quickly transform insights from big data into marketing execution.
Solution: Establish a process and infrastructure that allows agile data communication from your data lake into your marketing execution platforms and vice versa.
Solution: A complete data strategy will encompass the full lifecycle of data, from collection to storage to consumption by analysts and marketing platforms.

Do we have a governance process in place to maintain the integrity of the data lake over time?

Setting up the data lake is only part of the story. Over time, people can become less sensitive to the quality and format of the data they choose to pump into the lake, or business needs and usage can change. You must have a governance process in place to periodically (one to two times per year) review the data quality in the lake to make sure it is clean and consistent and that the data being captured is still valuable to those using it for reporting or analysis. If a data source being fed into the lake is no longer used by the business, it is time to turn off that source. Minimizing extraneous or imperfect data flowing into the data lake will reduce maintenance costs and make it easier for analysts to find and use the right data.

DATA GOVERNANCE EXAMPLE
Overview: Data columns, tables, or sources within a data warehouse or lake may become deprecated as new data is introduced. Old data sources may persist to maintain processes during transition periods and then are never cleaned up, or the effort is never invested to fully transition from legacy data sources to new ones. Over time, your data warehouse or data lake will become cluttered and difficult to use, often requiring extensive internal knowledge to identify the best data sources for a particular task.
Problem: Data governance is not effective enough to maintain the integrity of data sources and to fully transition systems from old data sources to new ones.
Solution: Data governance needs to become a priority within the organization. Businesses should realize that investment in data governance is reflected in lower costs on future data projects, lower analyst training costs, and greater consistency and less likelihood of error in reporting and data-driven marketing.

In the end, a data lake should not be a data dumping ground or simply an access point for data scientists to mine insights from log data. A data lake should be seen as an environment for developing data-powered marketing execution. To effectively support marketing execution, the data must be of high quality, be descriptive and supported by a consistent taxonomy, be able to integrate across other data environments and marketing systems, and be guided by a comprehensive data strategy. Maintaining a clean and effective data lake requires planning and governance; however, the ease and agility of storing data in the lake can lead to laziness and lack of effort in getting the right data or enforcing data quality. This is how data lakes can become polluted over time. If you are not driving value from your big data; if your data scientists and analysts are spending all of their time trying to clean, prepare, or connect data; or if they don't even want to touch the stuff, then your data lake may have transformed into a murky cesspool.

ABOUT THE AUTHORS

Robert Schroko
Principal, Marketing Solutions
Robert is a seasoned consultant/executive with a track record for shaping customer development and engagement strategies for major brand companies (Condé Nast, Saks Fifth Avenue, SONY Music) across digital and traditional media platforms. He has expertise in driving deeper and more profitable customer relationships across all phases of the customer lifecycle by leveraging transaction, behavioral, and big data analytics and applying them to web, tablet, mobile, store, and traditional media. Robert has used his combination of analytical and business acumen to implement customer acquisition, development, and retention strategies through predictive modeling; implement and optimize loyalty programs; and create in-store customer-centric clienteling solutions. His talents also include developing customer 360 databases, consolidating individual consumer behavior and transaction data collected from desktop, tablet, mobile (all via Omniture), and offline media. Robert holds an MS and a BS degree in industrial and management engineering from Rensselaer Polytechnic Institute. He is a frequent guest speaker at industry and academic events, which have included the University of Pennsylvania Wharton MBA Executive Program, Teradata Partners Conference, and The Conference Board Marketing Conference.

Peter Kemp
Principal, Customer Strategy
Peter has more than twenty years of strategy, marketing, and CRM consulting experience in retail, consumer products, and healthcare. At Merkle, he is a leader in developing the strategies, processes, organizations, and change management plans needed for clients to become more customer centric. Peter began his career in brand management at Kraft and Black & Decker. He then worked for Accenture in the consulting firm's marketing strategy practice. After Accenture, he joined DDB Advertising's CRM group, where he was the client lead for Lowe's Home Improvement, and then served as the SVP Global CRM lead for ExxonMobil, overseeing all CRM and loyalty programs for its retail operations worldwide. Peter received a BS in marketing from the University of Virginia and an MBA in finance from the Wharton School at the University of Pennsylvania, where he was also a top-ranked instructor in the undergraduate marketing program.

Allen Dickson
Director, Analytics
Allen has over ten years of experience in digital marketing and customer analytics, with deep expertise in digital data and personalization solutions. Since joining Merkle, Allen has worked across many client accounts, including Walmart, Dell, Samsung, Abercrombie & Fitch, Under Armour, Lowe's, Office Depot, Bose, and others, on a variety of digital analytics projects, from digital data capture and integration to personalization strategy and execution. Allen also led the development of Merkle's product recommendation engine. Prior to Merkle, Allen worked for Overstock.com, where he led its email marketing, web analytics, website testing and optimization, and personalization efforts. Allen received an MS in mathematics from Brigham Young University (with a concentration in algebraic topology) and also studied math at the University of Utah under an NSF research fellowship.

ABOUT MERKLE

Merkle is a leading data-driven, technology-enabled, global performance marketing agency that specializes in the delivery of unique, personalized customer experiences across platforms and devices. For more than 25 years, Fortune 1000 companies and leading nonprofit organizations have partnered with Merkle to maximize the value of their customer portfolios. The agency's heritage in data, technology, and analytics forms the foundation for its unmatched skills in understanding consumer insights that drive people-based marketing strategies. When combined with its strength in performance media, Merkle creates customer experiences that drive improved marketing results and shareholder value. With more than 3,700 employees, Merkle is headquartered in Columbia, Maryland, with 16 additional offices in the US and offices in Barcelona, London, Shanghai, and Nanjing. In 2016, the agency joined the Dentsu Aegis Network. For more information, contact Merkle at 1-877-9-Merkle or visit www.merkleinc.com.