Jason Virtue
Business Intelligence Technical Professional
jvirtue@microsoft.com
Agenda
  Microsoft Azure Data Services
  Azure Cloud Services
  Azure Machine Learning
  Azure Service Bus
  Azure Stream Analytics
  Azure HDInsight
  Azure Websites
Cloud Data Sources
The Cloud Changes the Landscape
  Expensive: Huge set-up costs of tools, expertise, and compute/storage capacity create unnecessary barriers to entry.
  Siloed data: Siloed and cumbersome data management restricts access to data.
  Disconnected tools: Complex and fragmented tools limit participation in exploring data and building models.
  Deployment complexity: Many models never achieve business value due to difficulties with deploying to production.
Microsoft Azure Data Services
  Producers: Heterogeneous client agents, External Data Sources
  Data Transport: Event Hubs (Service Bus)
  Storage: SQL Database, Table/Blob Storage, DocumentDB, External Data Sources
  Analytics: Machine Learning, HDInsight, Stream Analytics, Cloud Services
  Presentation & action: Azure Websites, Mobile Services, Notification Hubs, Power BI, External Services
Data Science is the new Business Intelligence
  (Diagram labels: Business Intelligence, Data Science)
Twitter Demo Flow and Architecture
  Sources: Twitter, Sentiment140
  Worker Roles: Tweet Publisher 1, Tweet Publisher 2
  Filtered tweets
  Event Hub: Inbound Tweets
  Stream Analytics: Aggregate Tweets by Topics
  Event Hub: Topic Aggregates
  Worker Role: Tweet Consumer (data pushed to clients using SignalR)
  Website: Real-Time Dashboard
  Stream Analytics: Aggregate Tweets
  Event Hub: Tweet Aggregates
  Stream Analytics: Archive Tweets
  Machine Learning, Hive, Blob Storage, Map/Reduce, HDFS, Power BI
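The "Aggregate Tweets by Topics" step can be pictured as a windowed count per topic. The sketch below is only an illustration in plain Python, not the demo's actual Stream Analytics query: the function name aggregate_by_topic, the field names "ts" and "topic", and the 10-second window are all assumptions made for this example.

```python
from collections import Counter, defaultdict


def aggregate_by_topic(tweets, window_seconds=10):
    """Count tweets per topic in fixed, non-overlapping time windows.

    Each tweet is a dict with hypothetical keys "ts" (seconds) and "topic".
    """
    counts = defaultdict(Counter)
    for tweet in tweets:
        # Align the timestamp to the start of its window.
        window_start = tweet["ts"] - tweet["ts"] % window_seconds
        counts[window_start][tweet["topic"]] += 1
    # Return plain dicts, ordered by window start.
    return {start: dict(c) for start, c in sorted(counts.items())}


# Hypothetical inbound tweets, already tagged with a topic.
sample_tweets = [
    {"ts": 1, "topic": "azure"},
    {"ts": 4, "topic": "hadoop"},
    {"ts": 9, "topic": "azure"},
    {"ts": 12, "topic": "azure"},
]
topic_aggregates = aggregate_by_topic(sample_tweets)
```

In the demo architecture the per-window results would be written to the Topic Aggregates event hub; here they are simply returned as a dictionary keyed by window start.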
What is in a Cloud Service?
What Can It Run?
Microsoft & Machine Learning
Answering questions with experience
  1991  Microsoft Research formed
  1997  Hotmail launches: Which email is junk?
  2008  Bing maps launches: What's the best way home?
  2009  Bing search launches: Which searches are most relevant?
  2010  Kinect launches: What does that motion mean?
  2014  Skype Translator launches: What is that person saying?
  2014  Azure Machine Learning launches: What will happen next?
"Machine learning is pervasive throughout Microsoft products."
John Platt, Distinguished Scientist at Microsoft Research
PCs/Laptops, POS Terminals, Self Checkout Stations, Smart Phones, Kiosks, Slates/Tablets, Point of Service Devices, Automation Devices, Servers, Digital Signs, Logic Controllers, ATM, Security, Thin Clients, Remote Medical Monitors, Vending Machines, Handhelds, Kinect, Specialized Devices, Diagnostic Equipment
MapReduce, Hive, Pig, C#, Stored Procedures
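As a reminder of what the MapReduce model underneath Hive and Pig does, here is a minimal single-process sketch of a word count in Python. It is an illustration only and does not use Hadoop: the function names map_phase, shuffle, reduce_phase, and word_count are hypothetical, and a real job would distribute these phases across the cluster.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: collapse the values for one key into a single total."""
    return key, sum(values)


def word_count(lines):
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


counts = word_count(["Hive Pig Hive", "pig MapReduce"])
```

A Hive query with GROUP BY and COUNT expresses the same computation declaratively, which is the main reason the higher-level tools exist.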
Evolving Approaches to Analytics
Traditional EDW: layer-cake approach
  Original Data
  Extract
  ETL Tool (SSIS, etc.): Transform
  Load
  Transformed Data
  EDW (SQL Server, Teradata, etc.)
  Data Marts
  BI Tools, Apps, Dashboards
Big Data Processing: ingest first, refinement on demand
  Original Data, Streaming data
  Ingest (EL)
  Scale-out Storage & Compute (HDFS, Blob Storage, etc.)
  Data Lake(s)
  Transform & Load
  Data Marts
  BI Tools, Apps, Dashboards
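The difference between the two approaches can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not a description of any particular product: the function names etl_load, ingest, and refine_on_demand, and the JSON record layout with "customer", "amount", and "clicks" fields, are all hypothetical.

```python
import json


def etl_load(raw_records):
    """Traditional path: transform before load.

    Only records that fit the warehouse schema (customer, amount) are kept.
    """
    warehouse = []
    for raw in raw_records:
        rec = json.loads(raw)
        if "customer" in rec and "amount" in rec:
            warehouse.append(
                {"customer": rec["customer"], "amount": float(rec["amount"])}
            )
    return warehouse


def ingest(raw_records):
    """Ingest-first path: store the original data untouched."""
    return list(raw_records)


def refine_on_demand(lake, field):
    """Apply structure only when a question is asked of the stored data."""
    parsed = (json.loads(raw) for raw in lake)
    return [rec[field] for rec in parsed if field in rec]


raw = [
    '{"customer": "a", "amount": "10.5"}',
    '{"customer": "b", "clicks": 3}',
]
warehouse = etl_load(raw)                 # second record does not fit the schema
lake = ingest(raw)                        # both records are retained as-is
clicks = refine_on_demand(lake, "clicks")
```

The warehouse path discards anything the schema did not anticipate, while the ingest-first path keeps the original data so that new questions can still be answered later.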
Microsoft HDInsight: Hadoop as a Service
  HBase as a columnar NoSQL transactional database running on Azure Blobs
  Storm as a streaming service for near-real-time processing
  Hadoop 2.4 support for 100x query gains on Hive queries
  Mahout support for machine learning + Hadoop
  Graphical User Interface for Hive queries
Cluster diagram
  Coordination
  Head: HMaster, Name Node, Job Tracker
  Workers: Region Server (x4), Data Node (x4), Task Tracker (x4)
Hadoop is a platform with a portfolio of projects
A Hadoop distribution is a package of projects
Programmableweb.com Directory
http://azure.microsoft.com/en-us/documentation/articles/stream-analytics-twitter-sentiment-analysis-trends/
© 2014 Microsoft Corporation. All rights reserved.