Next Phase of Evolution in Storage Industry: Impact of Machine Learning Udayan Singh, Head SPE-Storage, Compute & Manageability 30 May 2017 1 Copyright 2017 Tata Consultancy Services Limited
Agenda 1 Digital 5 Forces Impact on Storage Industry Companies Storage Industry Value Chain & Technology Trends. 2 3 AI/ML and the area s of Impact on Storage Industry Use Cases 4 5 Example Implementations 2
Digital 5 Forces Impact on Storage Industry Cloud Business Models Pay as you grow cloud storage based models Analytics & Big Data IOT BI email Online Gaming Image Processing Web 2.0 Media Streaming NoSQL Search Multi Media Static Content Application OLTP ECM Communication ERP Database Industry Vertical Focused HPC Social Networking Products and Services Prescriptive analytics-based issue identification and auto-resolution Next Gen Products: HyperScalers, Hyper Convergence, AFA, Cloud Storage Customer Segments Movement from only B2B to B2B and B2C Partner Network Cloud Storage through partners Mobility connect AI/Robotics Mobility Digital 5 forces impacting multiple dimensions of the Storage Industry Companies 3
Storage Industry Value Chain and Storage Technology Trends ISVs (Enterprise Network Switch 1 Software and Operating 1 Companies System Companies) Hyper Converged Infrastructure Delivering through simplicity 3 1 Storage/Server System OEMs (including Niche Companies) Distribution Companies Cloud Storage Operational costs of maintaining high-growth internal storage infrastructures Material & Packaging Equipment Manufacturers 1 Components Assembly Cloud Storage Companies Increased Collaboration - ISVs, Switch and Storage OEMs 2 End Customer Business Consumer Flash Storage (Hybrid Flash Array/All Flash Array) New interfaces - NVMe Software Defined Data Center Simpler, intuitive and intelligent Data Center Management 2 3 4 Increased competition - Cloud Storage Companies Consolidation - SSD/HDD Companies Emergence Niche Technology Artificial Intelligence / Machine Learning Self-service infrastructure active cognition and analytics-based automation 4
AI/ML and the area s of Impact on Storage Industry Artificial Intelligence Impact on Storage Robotics Sensory Perception PDLC: Requirements Gathering; Development; Quality Assurance; Sustenance Time To Market Cost Reduction Natural Language Processing Customer Support Services User Experience Improved Quality Machine Learning Image Analysis Speech Recognition Natural Language Generation Storage Admin Software : Provisioning; Management; Troubleshooting User Experience Cost Savings Deep Learning Knowledge Engineering End User Storage Software Products: File Storage; User Experience Improved Quality Cognition Storage Devices: SSD, HDD Improved Quality (endurance, performance) 5
Use Cases: Data Center Storage Software Machine Learning Executing Task Learning by experience Improving Performance Cloud Storage Management Data Protection Provision Current Stage Future Stage Storage Monitoring & Management Storage Analytics Customer Support Software Performance Management Monitoring Manage Issue Identification Issue Resolution Optimization Automated Update Manual Analysis Manual Performance Tuning Automated Data Storage Optimization Proactive Management Intelligent Proactive Error detection and Update Automated Root Cause Analysis & Corrective Action Intelligent Performance Tuning Intelligent Data Storage Optimization Self Managed Data Center - Customer Support - Storage Management Software System Health Monitoring Demand Prediction 1 Admin per 4 Pb/500 Servers 1 Admin per 20 Pb/10,000 Servers Storage Software Platform 6
Use Cases: Product Development Phases DevOps - Infrastructure Provisioning Manual triaging for provisioning of Cloud Software Identify potential challenges (Cloud) Build Process/Source code integration: Issue identification and pull back Deployment - Cloud Storage Software C B Integrate, Build, Review & Feedback Automated triaging of Cloud Software and automated resolution. Identify potential challenges proactively A Issue Identification, pull back and RCA (ML Classification) Root Cause Analysis, Report Logging: Manual Triaging Test Data Set/Case Creation Test Automated Triaging and Root Cause Analysis identification (ML Classification) Creation of Test Data Sets and Test Cases B A B C D Simple Medium Little Tough Tough Identify potential codes based on knowledge base and apply Understanding the code base (acquisitions) Capturing Product Requirements: Understand and write the requirements. Develop Initiate Project & Define Requirements 7 Recommend reusable code repositories (Internal and Open Source libraries) Improve documentation to enhance productivity (data model + documentation) Propose improvement to code quality D Capture systems requirements through spoken communication and converting it into PRD (Chatbots + NLP) Copyright 2017 Tata Consultancy Services
Example 1: Storage Performance Machine Learning Method Selection of Right Data Set Quality Check of Data Analysis of the Data Insight Generation Data Pattern Matching Models: MultiVariate Regression Analysis (Causal) Time Series Analysis Final: Multivariate Time Series (VAR Model) Features: Total IOPS Cache Hit Ratio (%) Disk Response Time Disk Utilization Prediction: Average Response Time (Latency) Similarly for- Capacity CPU Failure Disk Failure [%age Write Miss Data Transfer Rate (MB/s) No.of Active Network Ports Spindle Count RAID Configuration Disk Type] 8
Example 2: Prediction of Database Response Time Business Problem Solution To predict the database response time and facing challenges as below To create predictive models which predict DB response time. To predict 1-hour-later SLA violation of DB response time with 90% of recall and 70% of precision. The predictive models predict the Probabilistic distribution of DB response time at the time of 1hour later. Creation of Lab environment to load on DB and measurement of various parameters for model building Predictive models to: Predict number of queries on the DB after 1 hour DB response time after 1 hour To identify load on the system after 1 hour and corrective action planning Techniques - Unobserved component models and Poisson Regression Benefits All models are having Mean Absolute Percentage Error (MAPE) less than 5% Precision & recall rate for 1 hour later prediction was 60% and 45% respectively. 9
Thank You IT Services Business Solutions Consulting 10 studioppt I 05 I 2017