APAC Big Data & Cloud Summit 2013

1 APAC Big Data & Cloud Summit 2013
Big Data Analytics & Hadoop Use Cases
Eddie Toh, Server Marketing Manager
21 August 2013

2 From the dawn of civilization until 2003, we humans created 5 exabytes of information. Now we create that same amount of information in two days! In 2012, the digital universe of data will expand to 2.72 zettabytes (ZB). It is then predicted to double every two years.
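As a back-of-the-envelope reading of the doubling claim on this slide: starting from 2.72 ZB in 2012 and doubling every two years, the size after t years would be 2.72 x 2^(t/2) ZB. The sketch below is illustrative only. The function name and the assumption of smooth exponential growth are not from the slides.

```python
def projected_size_zb(target_year, base_year=2012, base_zb=2.72, doubling_years=2.0):
    """Project the size of the digital universe in zettabytes.

    Assumes smooth exponential growth that doubles every `doubling_years`
    years from `base_zb` in `base_year` (an assumption for illustration,
    using the 2012 figure quoted on the slide).
    """
    elapsed = target_year - base_year
    return base_zb * 2 ** (elapsed / doubling_years)


if __name__ == "__main__":
    # Print the projection at two-year steps from 2012 to 2020.
    for year in range(2012, 2021, 2):
        print(f"{year}: {projected_size_zb(year):.2f} ZB")
```

Under these assumptions the 2020 figure comes to about 43.5 ZB, which is 16 times the 2012 value.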

3 Big Data: Volume, Velocity, Variety (& Value)
"In God we trust, all others bring data" (NASA, Johnson Space Center)
7.9 ZB by x more bits in digital universe than stars in the physical universe
$600 Bn: potential value to US healthcare
>5 billion people calling, texting, tweeting & browsing on cell phones
450 billion business transactions per day by 2020 (IDC)
90% of the data in the world was created in the last 2 years
100 years' worth of video uploaded to YouTube every 10 days
Therapies tailored to a person's genome. Decoding the human genome: from 10 years to hours. On track to hit <$1000 per person
Explosive growth, 30 Tb/month billing data. Radical overhaul of customer service: self service, real-time access. 30x performance increase
How will businesses manage a 50x data growth by 2020 in an affordable way?

4 BIG DATA REQUIRES DIFFERENT APPROACHES
Diagram labels: Machine generated, Human generated, Business generated; Edge, Scale Up, Distributed; Healthcare, Govt., Retail

5 Big Data use cases across industries: Education, Financial Services

6 Democratize data analysis from edge to cloud
Intel can deliver end-to-end analytics, from intelligent systems at the edge to the datacenter/cloud

7 End-to-End Power Delivery Chain Operation
Fuel Supply System, Power Plants, Transmission System, Distribution System
Renewable Plants, Fuel Source/Storage, Energy Storage, End-uses & DR
Controllers, Sensors, Data Collection and Processing, Predictive Analytics
Monitoring, Ingestion, Modeling, Analysis, Coordination & Control

8 Will Big Data be the difference between success and failure of a political campaign?
"This was the first presidential election campaign where all of the data that was coming into the campaign was successfully collected and centralized. The Obama campaign did a successful job with that; the Romney campaign did not."
John Aristotle Phillips, Chief Executive of Aristotle International (WSJ 11/29/12)

9 Eric Dishman Video

10 From Intuition to Predictive Analytics: Big Data maturity framework
Big Data Adoption: Introduction, Deployment, Production, Ongoing Application
Business Challenges: a) Cost Reduction b) Competitive differentiation and innovation c) Revenue Growth
Revenue Growth; Competitive differentiation and innovation; Cost Reduction
Obtain / maintain customer loyalty; targeted focus; Competitive differentiation; Innovation; Revenue Growth
Stabilizing revenue generation
Intuition vs. Analytics; Future strategies, Day-to-Day operations; Enhanced Analytics; A data-driven enterprise
Descriptive Analytics: Financial and Operation Management, Sales and Marketing, Customer services
Prescriptive Analytics: Strategic Business development, Research and product Development, Enhanced Customer services
Predictive Analytics: Risk Management, Real-time customer experience, Automated resource allocation, Value creation and Brand Management

11 Big Data Functional Models: Where do I start?
Data Usage, Data Management, Big Data Infrastructure
Domain Expertise, Data Science, Hadoop Framework
Bus. Strategy, KPIs, LOB Reporting, Visual Structures, Machine Learning, Algorithm, Analysis, Integration, Query, Performance, Transport, Transformation, Warehousing, Efficiency, Trust, Workload, Governance, Tools, Network, Compute, Storage
Value: Asking the right question; Transform question to algorithm; Data Ingestion and Processing
Driving value from Big Data depends on Quality, Accuracy and Efficiency

12 Lastly: Intel Distribution for Apache Hadoop*
Performance, Intel Architecture, Management, Security

13 Backup

14 Intel* Distribution for Apache Hadoop: What did we launch
Intel Manager for Apache Hadoop software: Deployment, Configuration, Monitoring, Alerts, and Security
Flume: Log Collector
Zookeeper: Coordination
Sqoop: Data Exchange
HBase: Columnar Store
Oozie: Workflow
Pig: Scripting
Mahout: Machine Learning
R connectors: Statistics
YARN (MRv2): Distributed Processing Framework
HDFS: Hadoop Distributed File System
Hive: SQL Query
Legend: Intel unique; Intel enhancements contributed back to open source; Open source components included without change
Focus on near real-time analytics w/ HBase & Hive enhancements
Access control, encryption, secure data movement
Job throughput efficiency for HDFS
Dynamic replication for HDFS & HBase
Intel optimized total solution architecture: distro, storage, network, compute
X Performance for Real-time jobs (Open Source; Optimized Intel IA/Distro). HBase as the data store. Query all CDR in month. Inserting records/second/server. Read from disk: >400 query/second/server