APAC Big Data & Cloud Summit 2013

Big Data Analytics & Hadoop Use Cases
Eddie Toh, Server Marketing Manager
21 August 2013

From the dawn of civilization until 2003, we humans created 5 exabytes of information. Now we create that same amount of information in two days! In 2012, the digital universe of data will expand to 2.72 zettabytes (ZB). It is then predicted to double every two years.
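The doubling claim can be made concrete with a small calculation. The sketch below is only an illustration: it assumes smooth exponential growth from the 2.72 ZB figure quoted for 2012, and the function and constant names are hypothetical, not taken from the talk.

```python
# Illustrative projection only: assumes the digital universe starts at
# 2.72 ZB in 2012 and doubles exactly every two years, as quoted on the slide.
BASE_YEAR = 2012
BASE_SIZE_ZB = 2.72
DOUBLING_PERIOD_YEARS = 2.0


def projected_size_zb(year):
    """Return the projected size in zettabytes for the given year."""
    elapsed_years = year - BASE_YEAR
    return BASE_SIZE_ZB * 2 ** (elapsed_years / DOUBLING_PERIOD_YEARS)


if __name__ == "__main__":
    # Print the projection at each doubling step up to 2020.
    for year in range(2012, 2021, 2):
        print(f"{year}: {projected_size_zb(year):.2f} ZB")
```

Under these assumptions the size grows 16x between 2012 and 2020, since four doubling periods elapse.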

Big Data: Volume, Velocity, Variety (& Value)
"In God we trust, all others bring data" NASA, Johnson Space Center
7.9 ZB by 2015
3x more bits in the digital universe than stars in the physical universe
$600 Bn: potential value to US healthcare
>5 Billion: people calling, texting, tweeting & browsing on cell phones
450 Billion: business transactions per day by 2020 (IDC)
90% of data in the world was created in the last 2 years
100 years' worth of video uploaded to YouTube every 10 days
Therapies tailored to a person's genome
Decoding the human genome: from 10 years to hours; on track to hit <$1000 per person
Explosive growth, 30 Tb/month billing data
Radical overhaul of customer service: self service, realtime access
30x performance increase
How Will Businesses Manage a 50x Data Growth by 2020 in an Affordable Way?

Big Data Requires Different Approaches
Machine generated, human generated, business generated
Edge, Scale Up, Distributed

Big Data use cases across industries
Healthcare, Govt., Retail, Education, Financial Services

Democratize data analysis from edge to cloud
Intel can deliver end-to-end analytics, from intelligent systems at the edge to the datacenter/cloud.

End-to-End Power Delivery Chain Operation
Fuel Supply System, Power Plants, Transmission System, Distribution System
Renewable Plants, Fuel Source/Storage, Energy Storage, End-uses & DR
Controllers, Sensors
Data Collection and Processing, Predictive Analytics
Monitoring, Ingestion, Modeling, Analysis, Coordination & Control

Will Big Data be the difference between success and failure of a political campaign?
"This was the first presidential election campaign where all of the data that was coming into the campaign was successfully collected and centralized. The Obama campaign did a successful job with that; the Romney campaign did not." John Aristotle Phillips, Chief Executive of Aristotle International (WSJ 11/29/12)

Eric Dishman Video

From Intuition to Predictive Analytics
Big Data maturity framework
Big Data Adoption: Introduction, Deployment, Production, Ongoing Application
Business Challenges: a) Cost Reduction b) Competitive differentiation and innovation c) Revenue Growth
Revenue Growth, Competitive differentiation and innovation, Cost Reduction
Obtain / maintain customer loyalty, targeted focus, Competitive differentiation, Innovation, Revenue Growth, Stabilizing revenue generation
Intuition vs. Analytics: Future strategies, Day-to-Day operations
Enhanced Analytics: A data driven enterprise
Descriptive Analytics: Financial and Operation Management, Sales and Marketing, Customer services
Prescriptive analytics: Strategic Business development, Research and product Development, Enhanced Customer services
Predictive analytics: Risk Management, Real time customer experience, Automated resource allocation, Value creation and Brand Management

Big Data Functional Models: Where do I start?
Data Usage, Data Management, Big Data Infrastructure
Domain Expertise, Data Science, Hadoop Framework
Bus. Strategy KPIs LOB Reporting Visual Structures Machine Learning Algorithm Analysis Integration Query Performance Transport Transformation Warehousing Efficiency Trust Workload Governance Tools Network Compute Storage
Value
Asking the right question, Transform question to algorithm, Data Ingestion and Processing
Driving Value from Big Data depends on Quality, Accuracy and Efficiency

Lastly
Intel Distribution for Apache Hadoop*
Performance Intel Architecture Management Security

Backup

Intel* Distribution for Apache Hadoop: What did we launch
Intel Manager for Apache Hadoop software: Deployment, Configuration, Monitoring, Alerts, and Security
Flume: Log Collector
Zookeeper: Coordination
Sqoop: Data Exchange
HBase: Columnar Store
Oozie: Workflow
Pig: Scripting
Mahout: Machine Learning
R connectors: Statistics
YARN (MRv2): Distributed Processing Framework
HDFS: Hadoop Distributed File System
Hive: SQL Query
Intel unique; Intel enhancements contributed back to open source; Open source components included without change
Focus on near real-time analytics w/ HBase & Hive enhancements
Access control, encryption, secure data movement
Job throughput efficiency for HDFS
Dynamic replication for HDFS & HBase
Intel optimized total solution architecture: distro, storage, network, compute
5X Performance for Real-time jobs (chart: Open Source 700 vs. Optimized Intel IA/Distro 3500)
HBase as the data store. Query all CDR in month
Inserting 10000 records/second/server
Read from disk: >400 query/second/server