Analysis and Modeling of Time-Correlated Failures in Large-Scale Distributed Systems

1 Analysis and Modeling of Time-Correlated Failures in Large-Scale Distributed Systems
Nezih Yigitbasi (1), Matthieu Gallet (2), Derrick Kondo (3), Alexandru Iosup (1), Dick Epema (1)
(1) TU Delft, (2) École Normale Supérieure de Lyon, (3) INRIA
The Failure Trace Archive
Delft University of Technology

2 Failures Do Happen
- Build a computing system with 10 thousand servers with MTBF of 30 years each, watch one fail per day (Jeff Dean, Google Fellow, LADIS '09 keynote)
- Average worker deaths per MapReduce job is 1.2 (MapReduce, OSDI)
- % failures in TeraGrid (Khalili et al., GRID '06)
- During the month of March 2005 on one dedicated cluster with 1500 Xeon CPUs, there were 32,580 Sawzall jobs launched, using an average of 220 machines each. While running those jobs, 18,636 failures occurred (application failure, network outage, system crash, etc.) that triggered rerunning some portion of the job... (Rob Pike et al., Google)
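
The figures quoted on this slide can be sanity-checked with simple arithmetic. The sketch below assumes servers fail independently at a constant rate of one failure per MTBF, and a 365-day year. Both are simplifying assumptions made for this illustration, and the variable names are not from the slides.

```python
# Back-of-the-envelope check of the numbers quoted on the slide.
servers = 10_000
mtbf_years = 30
days_per_year = 365  # assumption: ignore leap years

# Assuming independent servers that each fail on average once per MTBF,
# the fleet-wide failure rate is servers / MTBF.
failures_per_day = servers / (mtbf_years * days_per_year)

# Sawzall figures for March 2005, as quoted on the slide.
sawzall_jobs = 32_580
sawzall_failures = 18_636
failures_per_job = sawzall_failures / sawzall_jobs

print(f"{failures_per_day:.2f} expected server failures per day")
print(f"{failures_per_job:.2f} failures per Sawzall job")
```

Under these assumptions the fleet sees a little under one failure per day, which matches the "watch one fail per day" remark.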

3 Are Failures Independent?
- Independence is a common assumption. Is this realistic for large-scale distributed systems?
- We already know that space correlations exist (M. Gallet, N. Yigitbasi, B. Javadi, D. Kondo, A. Iosup, D. Epema, A Model for Space-correlated Failures in Large-scale Distributed Systems, Euro-Par)
- Time correlations may impact:
  - Proactive fault-tolerance solutions
  - Design decisions
  - Checkpointing & scheduling decisions (e.g., migrate computation at the beginning of a predicted peak)

4 Our Goals
- GOAL 1: Investigate whether failures have time correlations
- GOAL 2: Model the time-varying behavior of failures (peaks)

5 Outline
- Background
- Our Approach
- Analysis of Time-Correlation
- Modeling the Peaks of Failures
- Conclusions

6 Why Not Root-Cause Analysis?
- Root-cause analysis is definitely useful
- Challenges:
  - Systems are large and complex
  - Not all subsystems provide detailed info
  - Little monitoring/debugging support
  - Environment-specific or temporary failures
  - Huge size of failure data (19 systems)

7 Failure Trace Archive (FTA)
Provides:
- Availability traces of diverse distributed systems of different scale
- Standard format for failure events
- Tools for parsing & analysis
Enables:
- Comparing models/algorithms using identical data sets
- Evaluation of the generality/specificity of models/algorithms across different types of systems
- Analysis of availability evolution across time scales
- And many more

8 FTA Schema
- Hierarchical trace format
- Resource centric
- Event-based
- Associated metadata
- Codes for different components and events
- Available in raw, tabbed, and MySQL formats

9 Sample Trace
- Identifiers for the event/component/node/platform
- Type of event: unavailability/availability
- Event start/stop time (UNIX time)
- Node name
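
To make the fields listed on this slide concrete, here is a minimal sketch of an event record and of how such records could be turned into an hourly failure-count series. The class name, field names, and the "unavailable"/"available" labels are hypothetical and chosen for illustration; they are not the FTA's actual column names or codes.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class TraceEvent:
    # Hypothetical field names mirroring the items listed on the slide.
    event_id: int
    component_id: int
    node_id: int
    platform_id: int
    node_name: str
    event_type: str   # "unavailable" or "available" (illustrative labels)
    start_time: int   # UNIX time, in seconds
    stop_time: int    # UNIX time, in seconds


def failures_per_hour(events):
    """Count unavailability events per hour, starting at the first failure."""
    starts = [e.start_time for e in events if e.event_type == "unavailable"]
    if not starts:
        return []
    origin = min(starts)
    counts = Counter((t - origin) // 3600 for t in starts)
    # Fill hours without failures with zero so the series is evenly spaced.
    return [counts.get(hour, 0) for hour in range(max(counts) + 1)]
```

An evenly spaced series like this is the kind of input that an auto-correlation analysis of the failure rate needs.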

11 Our Approach (1): Outline
- Traces: nineteen failure traces from the FTA, mostly production systems
- Analysis: use the auto-correlation of the failure-rate time series
- Modeling: fit well-known probability distributions to the failure data to model failure peaks

12 Our Approach (2): Traces
- 100K+ hosts
- ~1.2 M failure events
- 15+ years of operation in total

13 Our Approach (3): Analysis
Auto-Correlation Function (ACF)
- Measures the similarity between observations as a function of the time lag between them
- A mathematical tool for finding repeating patterns
- Used here for assessing time correlations
- Values lie in [-1, 1]: values near 0 indicate weak correlation, values near -1 or 1 indicate strong correlation
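
As an illustration of the ACF described on this slide, the following sketch computes the standard sample auto-correlation of a series at a given lag. It is a textbook estimator written for this transcript, not necessarily the exact estimator the authors used, and the function name is an assumption.

```python
def autocorrelation(series, lag):
    """Sample auto-correlation of `series` at `lag` (a value in [-1, 1])."""
    n = len(series)
    if not 0 <= lag < n:
        raise ValueError("lag must satisfy 0 <= lag < len(series)")
    mean = sum(series) / n
    variance_sum = sum((x - mean) ** 2 for x in series)
    if variance_sum == 0:
        raise ValueError("auto-correlation is undefined for a constant series")
    # Covariance between the series and a copy of itself shifted by `lag`.
    covariance_sum = sum(
        (series[t] - mean) * (series[t + lag] - mean) for t in range(n - lag)
    )
    return covariance_sum / variance_sum


# Made-up series with a period of 4 time steps, for illustration only.
rate = [5, 1, 1, 1] * 6
for lag in (0, 2, 4):
    print(lag, round(autocorrelation(rate, lag), 3))
```

For a series with a repeating pattern, the ACF is high at lags that are multiples of the period, which is how daily or weekly cycles show up in a failure-rate series.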

14 Our Approach (4): Modeling
- We fit five probability distributions to the empirical data: Exponential, Weibull, Pareto, Log-Normal, and Gamma
- Fitting uses maximum likelihood estimation, followed by goodness-of-fit tests
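
A minimal sketch of maximum likelihood fitting, restricted to the two distributions from this list whose estimators have closed forms (Exponential and Log-Normal). Weibull, Pareto, and Gamma fits and the goodness-of-fit tests are omitted here; in practice they are usually done with a statistics library. The function names and the sample data are assumptions made for illustration.

```python
import math


def fit_exponential(samples):
    """MLE for the exponential distribution: rate = 1 / sample mean."""
    rate = len(samples) / sum(samples)
    log_likelihood = sum(math.log(rate) - rate * x for x in samples)
    return {"rate": rate}, log_likelihood


def fit_lognormal(samples):
    """MLE for the log-normal: mean and (population) std of log(samples).

    Requires positive samples that are not all identical (sigma > 0).
    """
    logs = [math.log(x) for x in samples]
    mu = sum(logs) / len(logs)
    sigma = math.sqrt(sum((v - mu) ** 2 for v in logs) / len(logs))
    log_likelihood = sum(
        -math.log(x * sigma * math.sqrt(2 * math.pi))
        - (math.log(x) - mu) ** 2 / (2 * sigma ** 2)
        for x in samples
    )
    return {"mu": mu, "sigma": sigma}, log_likelihood


# Made-up inter-arrival times in seconds, for illustration only.
iat_seconds = [12.0, 30.0, 45.0, 60.0, 600.0, 2400.0]
for name, fit in (("exponential", fit_exponential), ("log-normal", fit_lognormal)):
    params, log_likelihood = fit(iat_seconds)
    print(name, params, round(log_likelihood, 2))
```

Comparing log-likelihoods gives only a rough ranking of candidate distributions; a goodness-of-fit test is still needed to decide whether any of them is an acceptable model.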

16 Analysis (1): Auto-correlation
[Figure: auto-correlation plot for the WEBSITES trace]
Many systems exhibit moderate/strong auto-correlation for moderate/short time lags (e.g., GRID5K, LDNS, SKYPE)

17 Analysis (2): Auto-correlation
[Figure: auto-correlation plot for the TERAGRID trace]
A small number of systems exhibit low auto-correlation (TeraGrid, PNNL, NOTRE-DAME)

18 Analysis (3): Failure Patterns
[Figure: two panels, MICROSOFT and SKYPE, each showing daily/weekly cycles]
Systems with similar usage patterns have similar failure patterns

19 Analysis (4): Workload Intensity vs Failure Rate
[Figure: GRID5000]
There is a strong correlation between the workload intensity and the failure rate in some systems
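
One common way to quantify a relationship like the one on this slide is the Pearson correlation coefficient between two aligned time series. The slides do not say which measure was used, so the sketch below is an assumption, and the function name and sample values are illustrative.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return covariance / (spread_x * spread_y)


# Made-up hourly values, for illustration only.
jobs_submitted = [10, 40, 80, 120, 60, 20]
failures = [1, 3, 7, 9, 5, 2]
print(round(pearson(jobs_submitted, failures), 3))
```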

21 Failure Peaks (1): Model
[Figure: peak model, with the levels μ and μ + kσ marked]

22 Failure Peaks (2): Identification
- Our goal: balance between capturing the extreme system behavior and characterizing an important part of the system failures
- We use a threshold to isolate peaks: μ + kσ, where k is a positive constant
  - Large k => few periods, explaining only a small fraction of failures
  - Small k => more failures, of probably very different characteristics
- We use k = 1
  - Tried k = {0.5, 0.9, 1.0, 1.1, 1.25, 1.5, 2.0}
  - Over all traces, the average fraction of downtime and the average number of failures are close (see Technical Report)
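
The thresholding rule on this slide can be sketched as follows. The sketch assumes that μ and σ are the mean and standard deviation of a per-interval failure count, that the population standard deviation is used, and that a peak is a maximal run of consecutive intervals strictly above the threshold. The slides do not spell out these details, so treat them as illustrative choices; the function names are also assumptions.

```python
import statistics


def find_peaks(rate, k=1.0):
    """Return (threshold, peaks); peaks are half-open (start, end) index ranges."""
    mu = statistics.fmean(rate)
    sigma = statistics.pstdev(rate)  # assumption: population standard deviation
    threshold = mu + k * sigma
    peaks = []
    start = None
    for i, value in enumerate(rate):
        if value > threshold and start is None:
            start = i                      # a peak begins
        elif value <= threshold and start is not None:
            peaks.append((start, i))       # the peak ended at the previous interval
            start = None
    if start is not None:
        peaks.append((start, len(rate)))   # series ends while still in a peak
    return threshold, peaks


def fraction_in_peaks(rate, peaks):
    """Fraction of all failures that fall inside the given peaks."""
    return sum(sum(rate[s:e]) for s, e in peaks) / sum(rate)


# Made-up hourly failure counts, for illustration only.
hourly_failures = [1, 1, 1, 9, 10, 1, 1, 1, 8, 1]
threshold, peaks = find_peaks(hourly_failures, k=1.0)
print(round(threshold, 2), peaks, round(fraction_in_peaks(hourly_failures, peaks), 2))
```

Raising k shrinks the set of peak intervals and lowering it grows the set, which is the trade-off the slide describes.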

23 Failure Peaks (3): Modeling Results (1)
1. On average, 50% - 95% of the system downtime is caused by failures that originate during peaks, but the fraction of peaks is < 10% for all platforms
2. The average peak durations are on the order of 1-2 hours
3. The average time between peaks is on the order of hours
4. The average failure inter-arrival time (IAT) over the entire trace is about 9x the IAT during peaks

24 Failure Peaks (4): Modeling Results (2)
5. The Exponential distribution is not a good fit for the IAT during peaks, the time between peaks, or the failure duration during peaks
   - Traditional models are not enough
6. Model parameters do not follow a heavy-tailed distribution
   - Goodness-of-fit test results (p-values) for the Pareto distribution are very low
7. The Weibull and Log-Normal distributions provide the best fit
   - See the paper for the parameters

25 Conclusions (1)
Large-scale study
- Nineteen traces, most of which are production systems
- 100K+ hosts
- ~1.2 M failure events
- 15+ years of operation
- Four new traces available in the FTA (3 CONDOR + 1 TERAGRID)
GOAL 1: Analysis
- Failures exhibit strong periodic behavior & time correlation
- Systems with similar usage patterns have similar failure patterns
- Strong correlation between workload intensity and failure rate

26 Conclusions (2)
GOAL 2: Modeling
- Modeled quantities: peak duration, time between peaks, the failure IAT during peaks, and the failure duration during peaks
- On average, 50% - 95% of the system downtime is caused by failures that originate during peaks (fraction of peaks < 10%)
- The Weibull and Log-Normal distributions provide a good fit

27 Thank you! Questions? Comments?
More information:
- Guard-g Project:
- The Failure Trace Archive:
- PDS publication database:


More information

In Cloud, Can Scientific Communities Benefit from the Economies of Scale?

In Cloud, Can Scientific Communities Benefit from the Economies of Scale? PRELIMINARY VERSION IS PUBLISHED ON SC-MTAGS 09 WITH THE TITLE OF IN CLOUD, DO MTC OR HTC SERVICE PROVIDERS BENEFIT FROM THE ECONOMIES OF SCALE? 1 In Cloud, Can Scientific Communities Benefit from the

More information

Automated Service Builder

Automated Service Builder 1 Overview ASB is a platform and application agnostic solution for implementing complex processing chains over globally distributed processing and data ASB provides a low coding solution to develop a data

More information
