DOWNTIME IS NOT AN OPTION


1 DOWNTIME IS NOT AN OPTION: HOW APACHE MESOS AND DC/OS KEEP APPS RUNNING DESPITE FAILURES AND UPDATES
2017 Mesosphere, Inc. All Rights Reserved.

2 WAIT, WHO ARE YOU?
Engineer at Mesosphere DC/OS

3 RUNNING NON-TRIVIAL APPLICATIONS AT SCALE IS *REALLY* COMPLICATED
A non-trivial app might well need all the following:
- multiple app servers
- load balancing
- message queues
- HA datastores
- analytics

"Bring cluster computing to non-experts: One of the most exciting things about datacenter technology is that it is increasingly being applied to big data problems in the sciences. With cloud computing, scientists can readily acquire the hardware to run large parallel computations; the main thing missing is the right software. These non-expert cluster users have very different needs from those in large corporations: they are not backed by an operations team that will configure their systems and tune their programs. Instead, they need cluster software that configures itself correctly out of the box, rarely fails, and can be debugged without intimate knowledge of several interacting distributed systems. These are difficult but worthwhile challenges for the community to pursue."
- The Datacenter Needs an Operating System

5 APACHE MESOS IS A CLUSTER RESOURCE MANAGER

7 DC/OS ENABLES MODERN DISTRIBUTED APPS
Layers, top to bottom:
- Modern App Components: Microservices (in containers); Big Data + Analytics Engines
- Components shown: Streaming, Batch, Functions & Logic, Analytics, Machine Learning, Search, Time Series, Databases, SQL / NoSQL
- Datacenter Operating System (DC/OS)
- Distributed Systems Kernel (Mesos)
- Any Infrastructure (Physical, Virtual, Cloud)

8 APACHE MESOS: THE KERNEL OF THE DC/OS

9 APACHE MESOS: THE BIRTH OF MESOS
- Spring 2009, CS262B: Ben Hindman, Andy Konwinski and Matei Zaharia create Nexus as their CS262B class project.
- March 2010, TWITTER TECH TALK: The grad students working on Mesos give a tech talk at Twitter.
- September 2010, MESOS PUBLISHED: "Mesos: A Platform for Fine-Grained Resource Sharing in the Data Center" is published as a technical report.
- December 2010, APACHE INCUBATION: Mesos enters the Apache Incubator.

10 APACHE MESOS: GRAD STUDENTS LEARNED HOW TO SHARE
Sharing resources between batch processing frameworks:
- Hadoop
- MPI
- Spark
What does an operating system provide?
- Resource management
- Programming abstractions
- Security
- Monitoring, debugging, logging

11 APACHE MESOS: CLUSTERING YOUR RESOURCES FOR YOU
Apache Mesos is a cluster resource manager. It handles:
- Aggregating resources and offering them to schedulers
- Launching tasks (i.e. processes) on those resources
- Communicating the state of those tasks back to schedulers
Tasks can be:
- Long running services
- Ephemeral / batch jobs
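The three responsibilities listed on this slide (offer resources, launch tasks, report state) can be sketched as a toy model. This is not the Mesos API: `Offer`, `Task`, `ToyScheduler` and `run_offer_cycle` are hypothetical names invented for illustration, and the "master" here is just a loop that assumes every launch succeeds.

```python
from dataclasses import dataclass, field


@dataclass
class Offer:
    agent: str
    cpus: float
    mem_mb: int


@dataclass
class Task:
    name: str
    cpus: float
    mem_mb: int
    state: str = "TASK_STAGING"


@dataclass
class ToyScheduler:
    """Hypothetical scheduler: takes any pending task that fits in an offer."""
    pending: list
    updates: list = field(default_factory=list)

    def on_offer(self, offer):
        # Decide which pending tasks to launch on the offered resources.
        launched = []
        cpus, mem = offer.cpus, offer.mem_mb
        for task in list(self.pending):
            if task.cpus <= cpus and task.mem_mb <= mem:
                cpus -= task.cpus
                mem -= task.mem_mb
                self.pending.remove(task)
                launched.append(task)
        return launched

    def on_status_update(self, task, state):
        # Record the state reported back for a task.
        self.updates.append((task.name, state))


def run_offer_cycle(offers, scheduler):
    """Toy 'master': offer resources, launch accepted tasks, report their state."""
    placements = {}
    for offer in offers:
        for task in scheduler.on_offer(offer):
            task.state = "TASK_RUNNING"  # assumption: every launch succeeds
            placements[task.name] = offer.agent
            scheduler.on_status_update(task, task.state)
    return placements


offers = [Offer("agent-1", cpus=1.0, mem_mb=1024),
          Offer("agent-2", cpus=4.0, mem_mb=8192)]
scheduler = ToyScheduler(pending=[Task("web", 0.5, 512), Task("batch", 2.0, 4096)])
placements = run_offer_cycle(offers, scheduler)
print(placements)
```

The point of the sketch is the division of labour: the resource manager only offers and reports, while the placement decision stays in the scheduler.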

12 APACHE MESOS: SCHEDULERS AND TASKS

13 APACHE MESOS: A NAIVE APPROACH
Typical Datacenter: siloed, over-provisioned servers, low utilization
Workloads shown: Spark/Hadoop, Kafka, mysql, microservice, Cassandra
Industry Average: 12-15% utilization

14 APACHE MESOS: MULTIPLEXING OF DATA, SERVICES, USERS, ENVIRONMENTS
Typical Datacenter: siloed, over-provisioned servers, low utilization
Mesos/ DC/OS: automated schedulers, workload multiplexing onto the same machines
Workloads shown: Spark/Hadoop, Kafka, mysql, microservice, Cassandra

15 FAILURE: HANDLING FAILURE

16 FAILURE: MESOS TASK FAILURE

17 MESOS TASK FAILURE
[Diagram: ZK, MARATHON, MASTER, CLIENT, EXECUTOR, TASK]

18 MESOS TASK FAILURE
[Diagram: the TASK hits "SEGFAULT :("; Status Update messages carry the failure up to MARATHON]

19 MESOS TASK FAILURE
[Diagram: Launch Task messages pass from MARATHON down to the EXECUTOR, which starts a new TASK]

20 MESOS TASK FAILURE
[Diagram: Status Update messages for the new TASK pass from the EXECUTOR back up to MARATHON]
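The task-failure sequence in slides 17-20 (failure status update, relaunch, new status update) can be sketched from the scheduler's side. This is a minimal illustration, not Marathon's implementation: `ToyServiceScheduler` and its methods are hypothetical, and `launch_pending` stands in for "accept an offer and send Launch Task". Only the task state names `TASK_FAILED` and `TASK_LOST` are real Mesos states.

```python
import itertools

# Real Mesos task states that this sketch treats as "needs a replacement".
RELAUNCH_STATES = {"TASK_FAILED", "TASK_LOST"}


class ToyServiceScheduler:
    """Hypothetical scheduler that keeps a fixed number of app instances running."""

    def __init__(self, app, instances):
        self.app = app
        self._ids = itertools.count(1)
        self.running = set()
        self.launch_queue = []
        for _ in range(instances):
            self._queue_launch()

    def _queue_launch(self):
        # Every launch gets a fresh task ID; failed IDs are never reused.
        self.launch_queue.append(f"{self.app}.{next(self._ids)}")

    def launch_pending(self):
        # Stand-in for accepting a resource offer and sending Launch Task.
        launched, self.launch_queue = self.launch_queue, []
        self.running.update(launched)
        return launched

    def on_status_update(self, task_id, state):
        # A failure update for a running task queues exactly one replacement.
        if state in RELAUNCH_STATES and task_id in self.running:
            self.running.discard(task_id)
            self._queue_launch()


sched = ToyServiceScheduler("web", instances=2)
first = sched.launch_pending()
sched.on_status_update("web.1", "TASK_FAILED")  # the segfault from slide 18
replacement = sched.launch_pending()
print(first, replacement, sorted(sched.running))
```

Ignoring updates for tasks that are no longer tracked keeps a duplicated status update from triggering a second replacement.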

21 FAILURE: MESOS FAILURE

22-25 LOCAL FAILURE
[Diagram, four frames: ZK, MARATHON, MASTER, CLIENT, EXECUTOR, TASK]

26 LOCAL FAILURE
[Diagram: a Re-register message between the CLIENT and the MASTER; EXECUTOR and TASK are still present]

27 FAILURE: MESOS HOST FAILURE

28 LOCAL FAILURE
[Diagram: ZK, MARATHON, MASTER, CLIENT, EXECUTOR, TASK]

29 LOCAL FAILURE
[Diagram: ZK, MARATHON, MASTER, CLIENT; the EXECUTOR and TASK are no longer shown]

30 LOCAL FAILURE
[Diagram: a Status Update message is sent; the EXECUTOR and TASK are still absent]

31 MESOS TASK FAILURE
[Diagram: a Resource Offer, followed by Launch Task messages; an EXECUTOR and TASK are shown again]

32 MESOS TASK FAILURE
[Diagram: Status Update messages pass from the EXECUTOR back up to MARATHON]

33 FAILURE: MESOS MASTER FAILURE

34-36 MASTER FAILURE
[Diagram, three frames: ZK, MARATHON, MASTER, CLIENT, EXECUTOR, TASK]

37 MASTER FAILURE
[Diagram: three Leading Master messages among ZK, MARATHON, MASTER, and CLIENT]

38 MASTER FAILURE
[Diagram: four Reregister messages among MARATHON, MASTER, CLIENT, and EXECUTOR]

39 MASTER FAILURE
[Diagram: four Reregistered messages in reply]

40 FAILURE: SCHEDULER FAILURE

41-43 SCHEDULER FAILURE
[Diagram, three frames: ZK, MARATHON, MASTER, CLIENT, EXECUTOR, TASK]

44 SCHEDULER FAILURE
[Diagram: Framework ID and Leading Master labels between ZK and MARATHON]

45 SCHEDULER FAILURE
[Diagram: a Reregister message]

46 SCHEDULER FAILURE
[Diagram: a Reregistered message]

47 SCHEDULER FAILURE
[Diagram: a Reconcile Tasks message and a Status Update message]
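The scheduler-failover steps in slides 44-47 (recover the Framework ID, reregister, reconcile tasks) can be sketched as follows. This is a toy model under stated assumptions: `ToyMaster` and `start_scheduler` are hypothetical, a plain dict stands in for ZooKeeper, and answering `TASK_LOST` for unknown tasks is a simplification of what a real master reports.

```python
class ToyMaster:
    """Hypothetical master that remembers tasks per framework ID."""

    def __init__(self):
        self.frameworks = {}  # framework_id -> {task_id: state}
        self._next = 1

    def register(self, framework_id=None):
        # A known ID re-attaches to its existing tasks; None gets a fresh ID.
        if framework_id is None:
            framework_id = f"framework-{self._next}"
            self._next += 1
        self.frameworks.setdefault(framework_id, {})
        return framework_id

    def reconcile(self, framework_id, task_ids):
        # Reply with one status per task the scheduler asks about.
        tasks = self.frameworks[framework_id]
        return {t: tasks.get(t, "TASK_LOST") for t in task_ids}


def start_scheduler(store, master):
    """Start (or fail over) a scheduler; `store` stands in for ZooKeeper."""
    framework_id = master.register(store.get("framework_id"))
    store["framework_id"] = framework_id  # persisted for the next instance
    known_tasks = store.get("task_ids", [])
    return framework_id, master.reconcile(framework_id, known_tasks)


store, master = {}, ToyMaster()
first_id, _ = start_scheduler(store, master)

# The first instance launches a task and records two task IDs it believes exist.
master.frameworks[first_id]["web.1"] = "TASK_RUNNING"
store["task_ids"] = ["web.1", "web.2"]

# The scheduler crashes; a new instance starts against the same store.
second_id, states = start_scheduler(store, master)
print(second_id == first_id, states)
```

Reusing the persisted ID is what lets the new instance see the old instance's tasks; reconciliation then corrects whatever the stored task list got wrong.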

48 ORCHESTRATION: CONTAINER ORCHESTRATION

49 CONTAINER ORCHESTRATION
- CONTAINER SCHEDULING
- RESOURCE MANAGEMENT
- SERVICE MANAGEMENT: Load Balancing, Readiness Checking

50 CONTAINER ORCHESTRATION
- CONTAINER SCHEDULING: Placement, Replication/Scaling, Resurrection, Rescheduling, Rolling Deployment, Upgrades, Downgrades, Collocation
- RESOURCE MANAGEMENT: Memory, CPU, GPU, Volumes, Ports, IPs, Images/Artifacts
- SERVICE MANAGEMENT: Labels, Groups/Namespaces, Dependencies, Load Balancing, Readiness Checking

51 The SMACK Stack
- EVENTS (Sensors, Devices, Clients): Ubiquitous data streams from connected devices
- INGEST (Apache Kafka): Ingest millions of events per second
- ANALYZE (Apache Spark): Real-time and batch process data
- STORE (Apache Cassandra): Distributed & highly scalable database
- ACT (Akka): Visualize data and build data driven applications
All running on Mesos/ DC/OS

52 METRICS: MONITORING

53 METRICS
Measurements captured to determine health and performance of cluster:
- How utilized is the cluster?
- Are resources being optimally used?
- Is the system performing better or worse over time?
- Are there bottlenecks in the system?
- What is the response time of applications?

54 DC/OS METRIC SOURCES
- Mesos metrics: Resource, frameworks, masters, agents, tasks, system, events
- Container Metrics: CPU, mem, disk, network
- Application Metrics: QPS, latency, response time, hits, active users, errors
[Diagram: Apps run in Containers, on top of Mesos, on top of the OS]

55 MESOS MASTER METRICS
Metrics for the master node are available at the following URL:
The response is a JSON object that contains metric names and values as key-value pairs.
Metric Groups:
- Resources
- Master
- System
- Slaves
- Frameworks
- Tasks
- Messages
- Event Queue
- Registrar
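Since the response is a flat JSON object of name/value pairs, a small helper can make it easier to browse. The sketch below groups keys by the text before the first "/". That prefix grouping is an assumption for illustration and does not necessarily match the metric groups listed on the slide; `group_metrics` is a hypothetical helper and the sample values are made up.

```python
import json
from collections import defaultdict


def group_metrics(snapshot_json):
    """Group a flat metrics object by the prefix before the first '/'."""
    metrics = json.loads(snapshot_json)
    groups = defaultdict(dict)
    for name, value in metrics.items():
        group, _, short_name = name.partition("/")
        groups[group][short_name] = value
    return dict(groups)


# Made-up sample in the shape the slide describes (metric name -> value).
sample = json.dumps({
    "master/uptime_secs": 3600.5,
    "master/elected": 1,
    "system/load_1min": 0.42,
    "registrar/queued_operations": 0,
})

grouped = group_metrics(sample)
for group, values in sorted(grouped.items()):
    print(group, values)
```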

56 MESOS MASTER BASIC ALERTS
Metric | Value | Inference
master/uptime_secs | is low | The master has restarted
master/uptime_secs | < 60 for sustained periods of time | The cluster has a flapping master node
master/tasks_lost | is increasing rapidly | Tasks in the cluster are disappearing. Possible causes include hardware failures, bugs in one of the frameworks or bugs in Mesos
master/slaves_active | is low | Slaves are having trouble connecting to the master
master/cpus_percent | > 0.9 for sustained periods of time | DCOS Cluster CPU utilization is close to capacity
master/mem_percent | > 0.9 for sustained periods of time | DCOS Cluster Memory utilization is close to capacity
master/disk_used & master/disk_percent | | DCOS Disk space consumed by Reservations
master/elected | is 0 for sustained periods of time | No Master is currently elected
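A few of the rules in this table can be sketched as checks over a series of metric samples. The thresholds (60 seconds, 0.9, elected is 0) come from the slide. Everything else is an assumption for illustration: the function names, the alert strings, and the reading of "sustained" as "true for the last `window` samples".

```python
LOW_UPTIME_SECS = 60     # threshold from the slide
HIGH_UTILIZATION = 0.9   # threshold from the slide


def sustained(samples, predicate, window):
    """True if the last `window` samples all satisfy `predicate`."""
    recent = samples[-window:]
    return len(recent) == window and all(predicate(s) for s in recent)


def check_master_alerts(samples, window=3):
    """Evaluate a subset of the slide's rules; `samples` is oldest first."""
    alerts = []
    if sustained(samples, lambda s: s["master/uptime_secs"] < LOW_UPTIME_SECS, window):
        alerts.append("flapping master")
    elif samples and samples[-1]["master/uptime_secs"] < LOW_UPTIME_SECS:
        alerts.append("master restarted")
    if sustained(samples, lambda s: s["master/cpus_percent"] > HIGH_UTILIZATION, window):
        alerts.append("cpu near capacity")
    if sustained(samples, lambda s: s["master/mem_percent"] > HIGH_UTILIZATION, window):
        alerts.append("memory near capacity")
    if sustained(samples, lambda s: s["master/elected"] == 0, window):
        alerts.append("no master elected")
    return alerts


# Made-up samples: CPU stays above 0.9 and the master restarts before the last one.
samples = [
    {"master/uptime_secs": 7200, "master/cpus_percent": 0.93,
     "master/mem_percent": 0.50, "master/elected": 1},
    {"master/uptime_secs": 7260, "master/cpus_percent": 0.95,
     "master/mem_percent": 0.55, "master/elected": 1},
    {"master/uptime_secs": 12, "master/cpus_percent": 0.97,
     "master/mem_percent": 0.60, "master/elected": 1},
]
alerts = check_master_alerts(samples)
print(alerts)
```

Requiring a full window before firing a "sustained" alert avoids paging on a single noisy sample, at the cost of a delay of `window` scrape intervals.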

57 DEMO?