Advancing your Big Data Strategy

Size: px
Start display at page:

Download "Advancing your Big Data Strategy"

Transcription

1

2 Welcome

3 # T C 1 8 Advancing your Big Data Strategy Robbin Cottiss Strategic Customer Consultant Tableau Vindy Krishnan Senior Product Manager Tableau

4 You Know Me

5 And Me DATA TABLEAU AND

6 Audience Poll How many of you are new to Tableau? How many of you are new to Big Data? How many of you have or are moving their Big Data Platform to the cloud?

7 Our Goals for Today Re-introduce Big Data and why it might be just Data Understand how the Tableau Platform Integrates with Modern Data Platforms Extract some patterns and generalize to provide a reference framework Evaluate successful customer strategies and architectures Demo the framework and recap The Goal of Goals: Develop a Platform Mindset

8 What s the Situation?

9 Converging Waves Data Cloud

10 Converging Waves Data Cloud

11 Data Cloud Modern Analytics Platform Tablea u

12 Why Do We Need a Modern Data Architecture?

13 Security & Compliance The Tableau Platform Desktop Browser Mobile Embedded Collaboration Alerting Subscriptions Storytelling Sharing Discussions Analytics Content Discovery Governance Data Prep Visual Ad-hoc Advanced Spatial Calculations Statistics Projects Recommendations Versioning Search Centralized Data Sources Certification Usage Analysis Permissions Data Blending Query Federation Visual Data Prep Auto Data Modeling Extensibility & APIs Data Access Live In-memory Hybrid Connectivity Deployment ON-PREMISES CLOUD HOSTED WINDOWS LINUX MAC MULTI-TENANT

14 Cloud Platform

15 Big Data is just DATA!

16 3Vs of DATA Coexist with the 3Vs of Use Cases

17 Principles of a Modern Data Architecture

18 Key Principles Support multiple outcomes, applications and personas Support multiple sources of data Support multiple tiers of data Decouple storage from compute Create an agile data pipeline

19 Support Multiple Outcomes, Applications and Personas

20 Personas Business Users Data Analysts Data Scientists

21 Support Multiple Sources of Data

22 Types of Data

23

24 Support Multiple Tiers of Data

25 Data Size Data Lakes Store Everything and Anything Unknown Questions with Unknown Answers Unstructured/Data Mining/ Data Science Data Warehouses Analytical Queries Known Questions with Unknown Answers Regularly Refreshed Business Concepts Analytical Databases Precomputed Aggregates Known Questions and Known Answers Analytical Dashboard Already Constructed Large data (raw or prepared) Prepared data Aggregated data Performance

26 Velocity of Decision Making

27 Decouple Storage from Compute

28

29 Customer Examples

30 Data Platform Fast Storage Data Viz Events Data Amazon Redshift Operations Data Amazon S3 Data Services Data Processors kragle metacat Data Exploration portal Re:Invent - Tableau Rules of Engagement in the Cloud - December 1, 2016

31

32

33 atscale CDH Data lake Warehouse Tableau Server + Desktop SQL + Feeds + CRM + Other Hadoop (HDFS Storage)

34 Data Platform Fast Storage Data Viz Events Data Amazon Redshift Operations Data Amazon S3 Data Services Data Processors kragle metacat Data Exploration portal Re:Invent - Tableau Rules of Engagement in the Cloud - December 1, 2016

35 Redshift Criteria Fast analytic queries OnDemand reports SQL Client and Reporting tool support Scalable Prototype

36 Access Points Use Cases Last Mile Data Prep/ Data Science Business Intelligence/Self service Analytics Data Warehousing/ Metadata Management Compute/ Query Serverless Spark SQL Serverless PySpark Redshift/ Druid/ Interana AWS Glue EMR/Spark Streaming (Java/ Scala) Data Layer Data Lake (S3)

37 Updated messaging platform with Amazon Kinesis Data Scientists Analysts Researchers Product Analytics Added Amazon Athena to our analytics ecosystem in support of discovery and ad-analyses New configuration helps us maximize benefit while reducing costs without jeopardizing speed or performance Successfully deployed Tableau Server to ~1,900 internal users in effort to scale selfservice analytics Customers Legacy Product 3 rd Party Data NEXTGEN Products Amazon Kinesis Amazon S3 Amazon Athena Amazon Redshift Analytics Ecosystem Tableau Server Discovery & Ad-hoc Analyses Operational Insights & Monitoring Business Stakeholders Editorial Teams Product Mgr Engineering Sales

38

39 High Level Reference Architecture

40 Vendor Big Data Stacks

41 Create an Agile Data Pipeline

42 Create an Agile Data Pipeline Data Store Process Store Analyze Insight

43 Demo

44 Sources and Acknowledgements

45 Sources and Acknowledgements NYC Taxi Data Trips: NYC Taxi - Shape Files: Python Modules: GIS - Fiona / Shapely / Rtree Redshift - Psycopg2 Blog Posts: NYC Taxi Data Inspiration: Todd Schneider AWS EMR with Presto CLI Scripts: Mark Litwintschik Documentation: AWS (EMR, Redshift, Spectrum Etc.) Tableau

46 Recap Takeaways

47 General Embrace continuous improvement Promote Ad Hoc Analyses & Automate what you can Design and Redesign as needs evolve Consider the breadth and depth of use cases Data is multi-channel, multi-layer, even multi sourced Don t always wait for it to be in the lake. Learn from the community You will hear about successes this week. One size doesn't fit all Create Stored Data for Analysis:

48 Please complete the session survey from the Session Details screen in your TC18 app

49 #TC18 Thank you! Contact or CTA info goes here

50