Cluster management at Google

Similar documents
Omega: flexible, scalable schedulers for large compute clusters. Malte Schwarzkopf, Andy Konwinski, Michael Abd-El-Malek, John Wilkes

Heterogeneity and Dynamicity of Clouds at Scale: Google Trace Analysis

Hawk: Hybrid Datacenter Scheduling

Preemptive, Low Latency Datacenter Scheduling via Lightweight Virtualization

Resource Scheduling Architectural Evolution at Scale and Distributed Scheduler Load Simulator

PICS: A Public IaaS Cloud Simulator. In Kee Kim, Wei Wang, and Marty Humphrey Department of Computer Science University of Virginia

Apache Mesos. Delivering mixed batch & real-time data infrastructure , Galway, Ireland

Gandiva: Introspective Cluster Scheduling for Deep Learning

Mesos and Borg (Lecture 17, cs262a) Ion Stoica, UC Berkeley October 24, 2016

On Cloud Computational Models and the Heterogeneity Challenge

Reducing Execution Waste in Priority Scheduling: a Hybrid Approach

Exploring Non-Homogeneity and Dynamicity of High Scale Cloud through Hive and Pig

Bluemix Overview. Last Updated: October 10th, 2017

Enabling Self-Service Analytics Across The UDA With Teradata AppCenter

Live Video Analytics at Scale with Approximation and Delay-Tolerance

5th Annual. Cloudera, Inc. All rights reserved.

Head of Enterprise Sales TW Stan Yao

Building a Multi-Tenant Infrastructure for Diverse Application Workloads

SIMPLIFY IT OPERATIONS WITH ARTIFICIAL INTELLIGENCE

Stateful Services on DC/OS. Santa Clara, California April 23th 25th, 2018

More information for FREE VS ENTERPRISE LICENCE :

Real-time Streaming Applications on AWS Patterns and Use Cases

Optimal Infrastructure for Big Data

From Microservices to Teraservices. Adrian Technology Fellow - Battery Ventures September 2015

DOWNTIME IS NOT AN OPTION

MICROSERVICES. Prabavathy Arumugam Software AG. All rights reserved. For internal use only

Data Center Operating System (DCOS) IBM Platform Solutions

Enterprise-Scale MATLAB Applications

Multi-Resource Fair Sharing for Datacenter Jobs with Placement Constraints

Triage: Balancing Energy and Quality of Service in a Microserver

Dependency-aware and Resourceefficient Scheduling for Heterogeneous Jobs in Clouds

Top six performance challenges in managing microservices in a hybrid cloud

Getting Started with vrealize Operations First Published On: Last Updated On:

Intro to Big Data and Hadoop

Data Analytics and CERN IT Hadoop Service. CERN openlab Technical Workshop CERN, December 2016 Luca Canali, IT-DB

Digital Transformation & olvency II Simulations for L&G: Optimizing, Accelerating and Migrating to the Cloud

Sr. Sergio Rodríguez de Guzmán CTO PUE

Tetris: Optimizing Cloud Resource Usage Unbalance with Elastic VM

White paper A Reference Model for High Performance Data Analytics(HPDA) using an HPC infrastructure

Morpheus: Towards Automated SLOs for Enterprise Clusters

Starting Workflow Tasks Before They re Ready

GUIDE The Enterprise Buyer s Guide to Public Cloud Computing

Extending on-premise HPC to the cloud

InfoSphere DataStage Grid Solution

CONTAINERS: DON'T SKEU THEM UP

E-guide Hadoop Big Data Platforms Buyer s Guide part 1

Application Migration Patterns for the Service Oriented Cloud

INTER CA NOVEMBER 2018

Self-driving Clouds: From Vision to Reality

Deliver a Private Cloud Middleware Platform or Public Cloud Platform as a Service

Cloudera, Inc. All rights reserved.

Bigger, Longer, Fewer: what do cluster jobs look like outside Google?

Grid 2.0 : Entering the new age of Grid in Financial Services

In Cloud, Can Scientific Communities Benefit from the Economies of Scale?

Social Media Analytics Using Greenplum s Data Computing Appliance

How In-Memory Computing can Maximize the Performance of Modern Payments

Investor Presentation. Fourth Quarter 2015

MS Integrating On-Premises Core Infrastructure with Microsoft Azure

Meeting the New Standard for AWS Managed Services

Kubernetes for the enterprise

The ABCs of. CA Workload Automation

Addressing the I/O bottleneck of HPC workloads. Professor Mark Parsons NEXTGenIO Project Chairman Director, EPCC

Scaling up MATLAB Analytics with Kafka and Cloud Services

Guagua: An Iterative Computing Framework on Hadoop. Zhang Pengshan(David), PayPal

Building a Data Lake on AWS EBOOK: BUILDING A DATA LAKE ON AWS 1

Uncovering the Hidden Truth In Log Data with vcenter Insight

Flink meet DC/OS. Deploying Apache Flink at Scale. Elizabeth K. Ravi FlinkForward San Francisco

FULTECH CONSULTING RISK TECHNOLOGIES

2015 The MathWorks, Inc. 1

Scaling up MATLAB Analytics with Kafka and Cloud Services

Investor Presentation. Second Quarter 2016

Top 5 Challenges for Hadoop MapReduce in the Enterprise. Whitepaper - May /9/11

Secure High-Performance SOA Management with Intel SOA Expressway

CIS : Scalable Data Analysis

Exploiting Big Data in Engineering Adaptive Cloud Services. Patrick Martin Database Systems Laboratory School of Computing, Queen s University

Multi-Stage Resource-Aware Scheduling for Data Centers with Heterogenous Servers

Timely Long Tail Identification Through Agent Based Monitoring and Analytics

alsched: Algebraic Scheduling of Mixed Workloads in Heterogeneous Clouds

What s new in Machine Learning across the Splunk Portfolio

Learning Based Admission Control. Jaideep Dhok MS by Research (CSE) Search and Information Extraction Lab IIIT Hyderabad

The IoT Solutions Space: Edge-Computing IoT architecture, the FAR EDGE Project John Professor Athens Information

Integrating MATLAB Analytics into Enterprise Applications The MathWorks, Inc. 1

Machine Learning For Enterprise: Beyond Open Source. April Jean-François Puget

What Do You Need to Ensure a Successful Transition to IoT?

Deep Learning Acceleration with MATRIX: A Technical White Paper

BACSOFT IOT PLATFORM: A COMPLETE SOLUTION FOR ADVANCED IOT AND M2M APPLICATIONS

Enterprise Analytics Accelerating Your Path to Value with an Open Analytics Platform

Ampliando MATLAB Analytics con Kafka y Servicios en la Nube

Technology Reply MISSION RICERCA & SVILUPPO CERTIFICATIONS AND SPECIALIZATIONS EXCELLENCE AWARD WINNER

Solution Architecture Training: Enterprise Integration Patterns and Solutions for Architects

20775A: Performing Data Engineering on Microsoft HD Insight

Features and Capabilities. Assess.

CLOUD computing and its pay-as-you-go cost structure

An Analytics System for Optimizing Manufacturing Processes

Five Interview Phases To Identify The Best Software Developers

Data-Powered Clouds: Challenges for Data Management, Cloud Computing, and Software

Virtualizing Big Data/Hadoop Workloads. Update for vsphere 6. Justin Murray VMware VMware Inc. All rights reserved.

CLOUD READINESS QUESTIONNAIRE

Big Data Hadoop Administrator.

PICS: A Public IaaS Cloud Simulator

Transcription:

Cluster management at Google LISA 2013 john wilkes (johnwilkes@google.com) Software Engineer, Google, Inc.

We own and operate data centers around the world http://www.google.com/about/datacenters/inside/locations/

Not a data center

Data center

Cluster management: what is it? A cluster is managed as 1 or more cells Cell agent manager manager manager

Cluster management: what is it? A cell runs jobs, made up of (1 to thousands) of tasks, for internal users Cell agent task manager manager manager

workload: job runtimes CDF Batch Service Job runtime [log]

workload: job inter-arrival times CDF Batch Service Service Batch Job inter-arrival time [log]

workload: batch/service split Cluster A Medium size Medium utilization Cluster B Large size Medium utilization Jobs/tasks: counts CPU/RAM: resource seconds [i.e. resource job runtime in sec.] Cluster C Medium (12k mach.) High utilization Public trace

Cluster management: what is it? Placing tasks onto machines is just a knapsack problem, with constraints CPU RAM

Cluster management: what is it? Heterogeneity and dynamicity of clouds at scale: Google trace analysis. SoCC 12

public cell workload trace Want to try it yourself? search for [john wilkes google cluster data] 29 day cell workload trace from May 2011 ~12.5k machines every job/task/machine event task usage every 5 mins (obfuscated data) CPU RAM

a few other moving parts system config security accounting/planning storage job config master agent GWS monitoring binaries + data distribution Diagram from an original by Cody Smith.

configuration Make an app work right once: simple Google Docs uses ~50 systems and services Make it run in production: priceless run it in 6 cells fix it in an emergency move it to another cell http://melinathinks.com

If you aren't measuring it, it is out of control...

SLI/SLOs for cluster manager 1. Availability % time master is up max outage duration 2. Performance RPC request latencies Time to schedule+start job 3. Evictions % jobs with too high an eviction rate % jobs with too many concurrent evictions % tasks evicted without proper notice

Cluster management: pre-omega The current system was built 2003-4. It works pretty well But... growing scale (#cores) inflexibility (adding features) internal complexity (adding people)

Cluster management: pre-omega RPCs Brian Grant, Oct 2013 Protobuf fields

The second system Main user goals: predictability & ease of use Main team goal: flexibility Omega is currently being deployed + prototyped

overall architecture CellState: minimum necessary data no policies just enforces invariants Omlets Calendar: everything is a scheduling event!

performance: simulation study good = down & blue UCB Mesos one path fast batch path Monolithic scheduler Omega: flexible, scalable schedulers for large compute clusters. EuroSys 2013 Omega

flexibility: MapReduce scheduler 200 11 1000 5 Jobs 3 50 100 450 8000 Workers per job [log scale] Omega: flexible, scalable schedulers for large compute clusters. EuroSys 2013

flexibility: MapReduce scheduler CDF 60% of MapReduce jobs better 3-4x speedup! Relative speedup [log10] Omega: flexible, scalable schedulers for large compute clusters. EuroSys 2013

multiple applications per machine tasks/machine threads/machine CPI2: CPU performance isolation for shared compute clusters. EuroSys 2013

multiple applications per machine μ μ+σ μ + 2σ μ + 3σ task CPI CPI data for all the tasks in a job per-cluster per-platform mean (µ) stddev (σ) outliers => victims CPI2: CPU performance isolation for shared compute clusters. EuroSys 2013

Omega goals: ease of use Workload variation CPU and RAM usage as a function of offered load load

Omega goals: ease of use Here s a binary run it Why didn t it run?

Open issues: smart explanations It s 3am and your pager goes off - are we in trouble? - are we about to get into trouble? what should you do about it?

summary Large-scale systems are exciting beasts scale failures QoS new designs configuration may be the next big challenge We re hiring!