THE IMPORTANCE OF END USER DATA PREPARATION
Welcome!
Dan Potter, Chief Marketing Officer
Howard Dresner, Chief Research Officer
June 28, 2016
Agenda
Dresner Advisory Services End User Data Preparation 2016 EndUserDataPrep.report
End User Data Preparation Defined
End User Data Preparation is a "self-service" capability for end users to model, prepare, and combine data prior to analysis. This may complement traditional IT-driven Data Quality/ETL processes or may be used independently.
End User Data Preparation Overview
- Importance of end user data preparation
- Current user scenarios
- User requirements
- Vendor ratings
Chart: Technologies and Initiatives Strategic to Business Intelligence
Response scale: Critical, Very important, Important, Somewhat important, Not important (axis 0% to 100%)
Items, in chart order:
- Dashboards
- End-user "self service"
- Data warehousing
- Advanced visualization
- Integration with operational processes
- Data discovery
- Enterprise planning/budgeting
- Data mining, advanced algorithms, predictive
- Embedded BI (contained within an application, portal, etc.)
- Mobile device support
- End-user data "blending" (data mashups), labeled on the slide as End User Data Preparation
- Location intelligence/analytics
- Collaborative support for group-based analysis
- In-memory analysis
- Software-as-a-service and cloud computing
- Search-based interface
- Pre-packaged vertical/functional analytical applications
- Big data (e.g., Hadoop)
- Ability to write to transactional applications
- Text analytics
- Social media analysis (Social BI)
- Open source software
- Complex event processing (CEP)
- Internet of Things (IoT)
- Cognitive BI (e.g., artificial intelligence-based BI)
Importance of End User Data Preparation
Importance of End-User Data Preparation
- Critical: 37.2%
- Very important: 25.9%
- Important: 17.6%
- Somewhat important: 14.1%
- Not important: 5.2%
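The by-function and by-size charts in this deck plot a "Mean" on a 0-to-5 axis. As a rough sketch of how such a mean could be derived from the overall importance shares, the snippet below assumes each response is scored from 5 (Critical) down to 1 (Not important). That scoring is an assumption for illustration; the slides do not state how the mean was computed.

```python
# Response shares (percent) from the "Importance of End-User Data Preparation" slide.
importance_pct = {
    "Critical": 37.2,
    "Very important": 25.9,
    "Important": 17.6,
    "Somewhat important": 14.1,
    "Not important": 5.2,
}

# ASSUMPTION: a 5-to-1 scoring of the response scale (not stated in the slides).
assumed_scores = {
    "Critical": 5,
    "Very important": 4,
    "Important": 3,
    "Somewhat important": 2,
    "Not important": 1,
}


def weighted_mean(pct, scores):
    """Return the share-weighted average score for a response distribution."""
    total = sum(pct.values())
    return sum(pct[label] * scores[label] for label in pct) / total


mean_importance = weighted_mean(importance_pct, assumed_scores)
print(f"Mean importance under the assumed scoring: {mean_importance:.2f}")
```

Dividing by the total of the shares, rather than by 100, keeps the result stable when published percentages do not sum to exactly 100 because of rounding.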
Chart: Importance of End-User Data Preparation by Function
Functions: Research and Development (R&D), Information Technology (IT), Sales and Marketing, Executive Management, Finance
Series: Critical, Very important, Important, Somewhat important, Not important (axis 0% to 100%); Mean (axis 0 to 5)
Chart: Importance of End-User Data Preparation by Organization Size
Organization sizes: 1-100, 101-1000, 1001-5000, More than 5000
Series: Critical, Very important, Important, Somewhat important, Not important (axis 0% to 100%); Mean (axis 0 to 5)
Frequency of End User Data Preparation
Frequency of End-User Data Preparation
- Constantly: 22.7%
- Frequently: 41.9%
- Occasionally: 23.0%
- Rarely: 8.0%
- Never: 4.4%
Callout on slide: 65%
Chart: Frequency of End-User Data Preparation by Function
Functions: Executive Management, Finance, Information Technology (IT), Research and Development (R&D), Sales and Marketing
Series: Constantly, Frequently, Occasionally, Rarely, Never (axis 0% to 100%); Mean (axis 0 to 5)
Chart: Frequency of End-User Data Preparation by Organization Size
Organization sizes: 1-100, 101-1000, 1001-5000, More than 5000
Series: Constantly, Frequently, Occasionally, Rarely, Never (axis 0% to 100%); Mean (axis 0 to 5)
Effectiveness of Current Approaches
Effectiveness of Current Approach to End-User Data Preparation
- Highly effective: 12.6%
- Somewhat effective: 52.6%
- Somewhat ineffective: 26.5%
- Totally ineffective: 8.2%
Callout on slide: 35%
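The frequency and effectiveness slides carry 65% and 35% callouts without saying how they were derived. The figures are consistent with combining the top two frequency responses and the bottom two effectiveness responses, which the short sketch below checks. The grouping is an inference from the published shares, not something the slides state.

```python
# Response shares (percent) as shown on the frequency and effectiveness slides.
frequency_pct = {
    "Constantly": 22.7,
    "Frequently": 41.9,
    "Occasionally": 23.0,
    "Rarely": 8.0,
    "Never": 4.4,
}
effectiveness_pct = {
    "Highly effective": 12.6,
    "Somewhat effective": 52.6,
    "Somewhat ineffective": 26.5,
    "Totally ineffective": 8.2,
}


def combined_share(pct, labels):
    """Sum the shares of the selected response labels."""
    return sum(pct[label] for label in labels)


# ASSUMPTION: the callouts group these responses; the slides do not say so.
frequent = combined_share(frequency_pct, ["Constantly", "Frequently"])
ineffective = combined_share(
    effectiveness_pct, ["Somewhat ineffective", "Totally ineffective"]
)

print(f"Constantly + Frequently: {frequent:.1f}%")
print(f"Somewhat + Totally ineffective: {ineffective:.1f}%")
```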
Chart: Effectiveness of Current Approach to End-User Data Preparation by Organization Size
Organization sizes: 1-100, 101-1000, 1001-5000, More than 5000
Series: Highly effective, Somewhat effective, Somewhat ineffective, Totally ineffective (axis 0% to 100%); Mean (axis 0 to 5)
Chart: Effectiveness of Current Approach to End-User Data Preparation 2015 to 2016
Categories: Highly effective, Somewhat effective, Somewhat ineffective, Totally ineffective
Series: 2015, 2016 (axis 0% to 60%)
End User Data Prep Features
Chart: End-User Data Preparation Usability Features
Response scale: Critical, Very important, Important, Somewhat important, Not important (axis 0% to 100%)
Features, in chart order:
- Visual interface for users to view and explore in-process data sets, interactively profile and refine data transformations prior to execution
- Automated detection of anomalies, outliers, & duplicates
- Technical expertise/programming is *NOT* required to build/execute data transformation scripts
- Visual highlighting of relationships between columns, attributes & datasets
- Automatically generate data transformation code/scripts for execution
- Support for entire data transformation process in a single application/user interface
- Automated recommendations for data relationships & keys for combining data across multiple data sets and sources
Chart: End-User Data Preparation Integration Features
Response scale: Critical, Very important, Important, Somewhat important, Not important (axis 0% to 100%)
Features, in chart order:
- Access to file formats (e.g., log files, CSV, Excel)
- Access to traditional databases (e.g., RDBMS)
- Ability to combine data across multiple data sets and sources through joins and merging data
- Ability to infer metadata by introspecting the data elements
- Access to big data (e.g., Hadoop)
Chart: End-User Data Preparation Manipulation Features
Response scale: Critical, Very important, Important, Somewhat important, Not important (axis 0% to 100%)
Features, in chart order:
- Ability to aggregate & group data
- Simple interface for imposing structure on raw data
- Ability to pivot (convert table to matrix) & reshape (convert matrix to table) data
- Ability to derive new data features from existing data (text extraction, math expressions, date expressions, etc.)
- Ability to normalize, standardize & enrich data
- Support for cutting, merging & replacing of values
- Ability to manipulate the order of data transformation steps
- Ability to unnest data (e.g., JSON/XML parsing)
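To make three of the manipulation features above concrete (aggregate and group, pivot from table to matrix, reshape from matrix back to table), here is a minimal plain-Python sketch. The sample rows, column names, and helper functions are hypothetical and are not taken from the study or from any vendor product; end-user tools typically expose these operations through a visual interface rather than code.

```python
from collections import defaultdict

# Hypothetical sample data (not from the study): one row per sale.
rows = [
    {"region": "East", "quarter": "Q1", "sales": 100},
    {"region": "East", "quarter": "Q2", "sales": 120},
    {"region": "West", "quarter": "Q1", "sales": 80},
    {"region": "West", "quarter": "Q1", "sales": 20},
    {"region": "West", "quarter": "Q2", "sales": 90},
]


def aggregate(rows, keys, value):
    """Group rows by the key columns and sum the value column."""
    totals = defaultdict(int)
    for row in rows:
        totals[tuple(row[k] for k in keys)] += row[value]
    return dict(totals)


def pivot(totals):
    """Turn {(row_key, col_key): value} into a nested matrix (table to matrix)."""
    matrix = defaultdict(dict)
    for (row_key, col_key), value in totals.items():
        matrix[row_key][col_key] = value
    return dict(matrix)


def reshape(matrix, row_name, col_name, value_name):
    """Flatten the matrix back into a list of rows (matrix to table)."""
    return [
        {row_name: row_key, col_name: col_key, value_name: value}
        for row_key, cols in matrix.items()
        for col_key, value in cols.items()
    ]


totals = aggregate(rows, ["region", "quarter"], "sales")
matrix = pivot(totals)
table = reshape(matrix, "region", "quarter", "sales")

print(matrix)
print(table)
```

The two West/Q1 rows collapse into a single cell during aggregation, so the reshaped table has one row per region and quarter rather than one per original sale.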
Chart: End-User Data Preparation Supported Outputs (axis 0% to 100%)
Output types:
- Files (e.g., Excel, CSV)
- Database (e.g., MySQL, MongoDB)
- Popular (third-party) business intelligence tool formats
Chart: End-User Data Preparation Deployment Features
Response scale: Critical, Very important, Important, Somewhat important, Not important (axis 0% to 100%)
Features, in chart order:
- Ability to schedule the execution/replay of data transformation processing
- Ability to monitor ongoing data transformation processing to alert on anomalies or changes in the structure
- Ability to iteratively sample data to provide an interactive testing of transformation logic
- Push-down processing of data transformations into the native data source for script execution (SQL, Pig, etc.)
Chart: Location of End-User Data Preparation Capabilities 2015 to 2016
Categories: On-Premises, Cloud-based, Both
Series: 2015, 2016 (axis 0% to 60%)
End User Data Preparation Vendor Ratings
Chart: End User Data Preparation Vendor Ratings 2016
Vendors shown: Jinfonet, Tableau, Trifacta, Paxata, RapidMiner, SAS, Datawatch, Panorama, ClearStory Data, TIBCO, Platfora, Oracle, Infor, Microsoft, Datameer, MicroStrategy, OpenText (Actuate), Dundas, Qlik, Birst, Domo, Information Builders, Informatica, InetSoft, Logi Analytics, Dimensional Insight, Pentaho (Hitachi), Sisense, SAP
Scores plotted: Usability, Integration, Output, Data Manipulation, Deployment, Overall (axis ticks 1, 2, 4, 8, 16, 32, 64)
Conclusions
- Huge need for improved data prep; users waste valuable time using whatever tool is at hand
- End user data preparation empowers end users, supporting self-service initiatives
- Key technology selection criteria: usability (visualization, automation), manipulation, and integration
Agenda
The Datawatch Promise
1. You will spend more time analyzing data, not preparing it
2. You will access all of the important and trusted data you need
Embedded and Resold by Industry Leaders
IBM selected Datawatch for:
- IBM Watson Analytics
- IBM Cognos Analytics
- IBM Content Manager on Demand
Dell selected Datawatch for:
- Dell Statistica for Predictive Analytics
- Dell Edge Gateway for IoT
40,000 Organizations Use Datawatch
- Financial Services
- Healthcare
- Government
- Retail
- Other Industries
Why 40,000 Organizations Use Datawatch?
SPEND LESS TIME PREPARING DATA
- Analysts waste up to 80% of their time preparing data rather than analyzing it
- $22,000 wasted per year, per analyst
Why 40,000 Organizations Use Datawatch?
- 12%: Relational, Excel, Hadoop, Salesforce, etc.
- Only 12% of enterprise data is used to make decisions
Why 40,000 Organizations Use Datawatch?
- 12%: Relational, Excel, Hadoop, Salesforce, etc.
- 88%: Reports, Web Pages, JSON, Log Files, etc.
Make Better Decisions
Datawatch Platform
Stages: ACQUIRE, GOVERN, DISCOVER, PREPARE, AUTOMATE
- Any data including multi-structured
- Remove risk of non-managed data
- Simply powerful for business users
- Proven scalability to thousands
Utilized Across the Organization
Analytics
Analysts
- Financial Analysts
- Marketing Analysts
- Sales Analysts
- Operations Analysts
Use Cases
- Visualization & Discovery
- Dashboards
- Advanced Analytics
Operations
Broad Array of Users
- Finance
- Supply Chain
- Risk
- Sales & Marketing
Use Cases
- Reconciliation
- Auditing
- Compliance
- Consolidation
Data Preparation Automation Service Content Repository Governance and Control Connectivity Report Mining API
Recommended Next Steps
- Read full Market Study
- Try Datawatch
- Attend weekly product demo
Thank You!
Dan Potter, Chief Marketing Officer, Datawatch
Howard Dresner, Chief Research Officer, Dresner Advisory Services