Advancing your Big Data Strategy
|
|
- Darren Mitchell
- 5 years ago
- Views:
Transcription
1
2 Welcome
3 # T C 1 8 Advancing your Big Data Strategy Robbin Cottiss Strategic Customer Consultant Tableau Vindy Krishnan Senior Product Manager Tableau
4 You Know Me
5 And Me DATA TABLEAU AND
6 Audience Poll How many of you are new to Tableau? How many of you are new to Big Data? How many of you have or are moving their Big Data Platform to the cloud?
7 Our Goals for Today Re-introduce Big Data and why it might be just Data Understand how the Tableau Platform Integrates with Modern Data Platforms Extract some patterns and generalize to provide a reference framework Evaluate successful customer strategies and architectures Demo the framework and recap The Goal of Goals: Develop a Platform Mindset
8 What s the Situation?
9 Converging Waves Data Cloud
10 Converging Waves Data Cloud
11 Data Cloud Modern Analytics Platform Tablea u
12 Why Do We Need a Modern Data Architecture?
13 Security & Compliance The Tableau Platform Desktop Browser Mobile Embedded Collaboration Alerting Subscriptions Storytelling Sharing Discussions Analytics Content Discovery Governance Data Prep Visual Ad-hoc Advanced Spatial Calculations Statistics Projects Recommendations Versioning Search Centralized Data Sources Certification Usage Analysis Permissions Data Blending Query Federation Visual Data Prep Auto Data Modeling Extensibility & APIs Data Access Live In-memory Hybrid Connectivity Deployment ON-PREMISES CLOUD HOSTED WINDOWS LINUX MAC MULTI-TENANT
14 Cloud Platform
15 Big Data is just DATA!
16 3Vs of DATA Coexist with the 3Vs of Use Cases
17 Principles of a Modern Data Architecture
18 Key Principles Support multiple outcomes, applications and personas Support multiple sources of data Support multiple tiers of data Decouple storage from compute Create an agile data pipeline
19 Support Multiple Outcomes, Applications and Personas
20 Personas Business Users Data Analysts Data Scientists
21 Support Multiple Sources of Data
22 Types of Data
23
24 Support Multiple Tiers of Data
25 Data Size Data Lakes Store Everything and Anything Unknown Questions with Unknown Answers Unstructured/Data Mining/ Data Science Data Warehouses Analytical Queries Known Questions with Unknown Answers Regularly Refreshed Business Concepts Analytical Databases Precomputed Aggregates Known Questions and Known Answers Analytical Dashboard Already Constructed Large data (raw or prepared) Prepared data Aggregated data Performance
26 Velocity of Decision Making
27 Decouple Storage from Compute
28
29 Customer Examples
30 Data Platform Fast Storage Data Viz Events Data Amazon Redshift Operations Data Amazon S3 Data Services Data Processors kragle metacat Data Exploration portal Re:Invent - Tableau Rules of Engagement in the Cloud - December 1, 2016
31
32
33 atscale CDH Data lake Warehouse Tableau Server + Desktop SQL + Feeds + CRM + Other Hadoop (HDFS Storage)
34 Data Platform Fast Storage Data Viz Events Data Amazon Redshift Operations Data Amazon S3 Data Services Data Processors kragle metacat Data Exploration portal Re:Invent - Tableau Rules of Engagement in the Cloud - December 1, 2016
35 Redshift Criteria Fast analytic queries OnDemand reports SQL Client and Reporting tool support Scalable Prototype
36 Access Points Use Cases Last Mile Data Prep/ Data Science Business Intelligence/Self service Analytics Data Warehousing/ Metadata Management Compute/ Query Serverless Spark SQL Serverless PySpark Redshift/ Druid/ Interana AWS Glue EMR/Spark Streaming (Java/ Scala) Data Layer Data Lake (S3)
37 Updated messaging platform with Amazon Kinesis Data Scientists Analysts Researchers Product Analytics Added Amazon Athena to our analytics ecosystem in support of discovery and ad-analyses New configuration helps us maximize benefit while reducing costs without jeopardizing speed or performance Successfully deployed Tableau Server to ~1,900 internal users in effort to scale selfservice analytics Customers Legacy Product 3 rd Party Data NEXTGEN Products Amazon Kinesis Amazon S3 Amazon Athena Amazon Redshift Analytics Ecosystem Tableau Server Discovery & Ad-hoc Analyses Operational Insights & Monitoring Business Stakeholders Editorial Teams Product Mgr Engineering Sales
38
39 High Level Reference Architecture
40 Vendor Big Data Stacks
41 Create an Agile Data Pipeline
42 Create an Agile Data Pipeline Data Store Process Store Analyze Insight
43 Demo
44 Sources and Acknowledgements
45 Sources and Acknowledgements NYC Taxi Data Trips: NYC Taxi - Shape Files: Python Modules: GIS - Fiona / Shapely / Rtree Redshift - Psycopg2 Blog Posts: NYC Taxi Data Inspiration: Todd Schneider AWS EMR with Presto CLI Scripts: Mark Litwintschik Documentation: AWS (EMR, Redshift, Spectrum Etc.) Tableau
46 Recap Takeaways
47 General Embrace continuous improvement Promote Ad Hoc Analyses & Automate what you can Design and Redesign as needs evolve Consider the breadth and depth of use cases Data is multi-channel, multi-layer, even multi sourced Don t always wait for it to be in the lake. Learn from the community You will hear about successes this week. One size doesn't fit all Create Stored Data for Analysis:
48 Please complete the session survey from the Session Details screen in your TC18 app
49 #TC18 Thank you! Contact or CTA info goes here
50