Hawk: Hybrid Datacenter Scheduling

1 Hawk: Hybrid Datacenter Scheduling. Pamela Delgado, Florin Dinu, Anne-Marie Kermarrec, Willy Zwaenepoel. July 10th, 2015, USENIX ATC.

2 Introduction: datacenter scheduling. Diagram: Job 1 to Job N, each made up of tasks, go through a scheduler that places the tasks on the cluster.

3 Introduction: centralized scheduling. Diagram: Job 1 to Job N, each made up of tasks, all go through a single centralized scheduler that places the tasks on the cluster.

4 Introduction: centralized scheduling. Diagram: Job 1, Job 2, ..., Job N lined up at the centralized scheduler in front of the cluster.

5 Introduction: centralized scheduling. Good: placement. Not so good: scheduling latency.

6 Introduction: distributed scheduling. Diagram: Job 1, Job 2, ..., Job N are handled by distributed schedulers 1, 2, ..., N, which place tasks on the cluster.

7 Introduction: distributed scheduling. Good: scheduling latency. Not so good: placement.

8 Outline. 1) Introduction. 2) Hawk hybrid scheduling: rationale, design. 3) Evaluation: simulation, real cluster implementation. 4) Conclusion.

9 Hybrid scheduling. Diagram: a centralized scheduler and distributed schedulers 1 to N share one cluster; some jobs go through the centralized scheduler and the rest through the distributed schedulers.

10 Hawk: hybrid scheduling. Long jobs: centralized. Short jobs: distributed.

11 Hawk: hybrid scheduling. Long jobs 1 to M go to the centralized scheduler; short jobs 1 to N go to distributed schedulers 1 to N. Long/short: estimated execution time vs. cut-off.
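The long/short split on this slide can be sketched as follows. This is a minimal illustration, not the Hawk implementation: the `Job` class, the `route` function, and the use of the mean estimated task runtime as the job's estimated execution time are assumptions made for the example. The slide only states that an estimate is compared against a cut-off.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Job:
    """Hypothetical job record: a name plus estimated per-task runtimes (seconds)."""
    name: str
    task_runtimes_s: List[float]


def estimated_runtime(job: Job) -> float:
    # Assumption: the job-level estimate is the mean of the task estimates.
    return sum(job.task_runtimes_s) / len(job.task_runtimes_s)


def route(job: Job, cutoff_s: float) -> str:
    """Return which scheduler type handles the job.

    Jobs whose estimate reaches the cut-off count as long and go to the
    centralized scheduler; all others count as short and go to a
    distributed scheduler.
    """
    if estimated_runtime(job) >= cutoff_s:
        return "centralized"
    return "distributed"
```

For example, with a cut-off of 100 seconds, `route(Job("etl", [500.0, 700.0]), 100.0)` selects the centralized scheduler, while `route(Job("query", [10.0, 20.0]), 100.0)` selects a distributed one.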

12 Rationale for Hawk. Typical production workloads: few long jobs, which use most of the resources; many short jobs, which use little of the resources.

13 Rationale for Hawk (continued). Chart: percentage of long jobs vs. percentage of task-seconds for long jobs. Source: Design Insights for MapReduce from Diverse Production Workloads, Chen et al., 2012.

14 Rationale for Hawk (continued). Chart: percentage of long jobs vs. percentage of task-seconds for long jobs. Long jobs are a minority but take up most of the resources. Source: Design Insights for MapReduce from Diverse Production Workloads, Chen et al., 2012.

15 Hawk: hybrid scheduling. Long jobs (centralized): bulk of resources, so they need good placement; few jobs, so scheduling latency stays reasonable. Short jobs (distributed): latency sensitive, so they need fast scheduling; few resources, so they can trade off not-so-good placement.

16 Hawk: hybrid scheduling. Best of both worlds. Long jobs (centralized): bulk of resources, so they need good placement; few jobs, so scheduling latency stays reasonable. Short jobs (distributed): latency sensitive, so they need fast scheduling; few resources, so they can trade off not-so-good placement. Good: scheduling latency for short jobs. Good: placement for long jobs.

17 Hawk: distributed scheduling. Sparrow. Work-stealing.

19 Sparrow. The distributed scheduler places task reservations on randomly chosen nodes (power of two choices).
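A simplified sketch of random reservation placement is shown below. It is not Sparrow's code: the function name `place_reservations`, the list-of-lists queue model, and the choice to probe two distinct random nodes per task for the whole job at once are assumptions made for illustration. The slide only says that reservations are placed at random using the power of two choices.

```python
import random
from typing import List


def place_reservations(num_tasks: int,
                       queues: List[list],
                       probes_per_task: int = 2,
                       rng=random) -> List[int]:
    """Queue one reservation on each of probes_per_task * num_tasks random nodes.

    queues[i] is the reservation queue of node i. The probed nodes are
    distinct, and their number is capped at the cluster size. Returns the
    indices of the probed nodes.
    """
    wanted = probes_per_task * num_tasks
    count = min(len(queues), wanted)
    probed = rng.sample(range(len(queues)), count)
    for node in probed:
        # Placeholder entry: only a reservation is queued here, no task payload.
        queues[node].append("reservation")
    return probed
```

With ten empty queues and a three-task job, `place_reservations(3, queues)` leaves six nodes holding one reservation each. Which six nodes are chosen depends on the random number generator.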

21 Sparrow and high load. Random placement: low likelihood of finding a free node.

22 Sparrow and high load. Random placement: low likelihood of finding a free node. High load + job heterogeneity: head-of-line blocking.

23 Hawk work-stealing. A node becomes free.

24 Hawk work-stealing. 1. Free node: contacts a random node for probes. 2. Random node: sends the short-task reservations in its queue.

25 Hawk work-stealing. 1. Free node: contacts a random node for probes. 2. Random node: sends the short-task reservations in its queue. High load: high probability of contacting a node with a backlog.
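The two stealing steps can be sketched as follows. This is an illustration under stated assumptions, not Hawk's actual protocol: the function name `steal_short_tasks`, the dictionary-of-queues model, and the policy of handing over every short entry in the victim's queue are all hypothetical. The slides do not say how many reservations are transferred.

```python
import random
from typing import Dict, List, Tuple

Entry = Tuple[str, str]  # (task id, kind), where kind is "short" or "long"


def steal_short_tasks(queues: Dict[int, List[Entry]],
                      idle_node: int,
                      rng=random) -> List[Entry]:
    """Step 1: the idle node contacts one random other node.
    Step 2: that node hands over the short entries waiting in its queue.

    Long entries stay where they are. Returns the stolen entries, which
    are also appended to the idle node's queue.
    """
    candidates = [node for node in queues if node != idle_node]
    if not candidates:
        return []
    victim = rng.choice(candidates)
    stolen = [entry for entry in queues[victim] if entry[1] == "short"]
    queues[victim] = [entry for entry in queues[victim] if entry[1] != "short"]
    queues[idle_node].extend(stolen)
    return stolen
```

Under high load most nodes have queued work, so a single random contact is likely to return something, which is the point made on the slide.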

26 Hawk cluster partitioning. The centralized scheduler and the distributed schedulers do not coordinate. Challenge: no free nodes for mice (short jobs)! Reserved nodes: a small cluster partition.

27 Hawk cluster partitioning. The centralized scheduler and the distributed schedulers do not coordinate. Challenge: no free nodes for mice (short jobs)! Reserved nodes: a small cluster partition. Short jobs schedule anywhere. Long jobs schedule only on non-reserved nodes.
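The partitioning rule can be expressed as a small eligibility check. This is a sketch under assumptions: the function name `eligible_nodes`, the `reserved_fraction` parameter, and the convention that the lowest-numbered nodes form the reserved partition are hypothetical. The slides only state that a small partition is reserved and that long jobs must stay out of it.

```python
from typing import List


def eligible_nodes(job_kind: str,
                   num_nodes: int,
                   reserved_fraction: float) -> List[int]:
    """Return the node indices on which a job of the given kind may run.

    Nodes 0 .. reserved-1 form the reserved partition (an assumption made
    for this sketch). Short jobs may run anywhere; long jobs may run only
    on the non-reserved nodes.
    """
    reserved = int(num_nodes * reserved_fraction)
    if job_kind == "short":
        return list(range(num_nodes))
    return list(range(reserved, num_nodes))
```

With 10 nodes and a reserved fraction of 0.2, short jobs see all 10 nodes, while long jobs see only nodes 2 through 9.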

28 Hawk design summary. Hybrid scheduler: long centralized, short distributed. Work-stealing. Cluster partitioning.

29 Evaluation: 1. Simulation. Sparrow simulator. Google trace. Vary the number of nodes to vary cluster utilization. Measure: job running time. Report 50th and 90th percentiles for short and long jobs, normalized to Sparrow.
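The reported metric, a percentile of job running time normalized to Sparrow, can be computed as in the sketch below. The function names and the nearest-rank percentile definition are assumptions; the slides do not say which percentile method was used.

```python
from typing import Sequence


def percentile(values: Sequence[float], p: int) -> float:
    """Nearest-rank percentile for an integer p in 1..100.

    Assumes values is non-empty.
    """
    ordered = sorted(values)
    # Integer ceiling of p * n / 100, so no floating-point rounding is involved.
    rank = max(1, (p * len(ordered) + 99) // 100)
    return ordered[rank - 1]


def normalized_percentile(hawk_times: Sequence[float],
                          sparrow_times: Sequence[float],
                          p: int) -> float:
    """Ratio of Hawk's p-th percentile running time to Sparrow's.

    A value below 1 means Hawk is better at that percentile.
    """
    return percentile(hawk_times, p) / percentile(sparrow_times, p)
```

In the plots, a Hawk/Sparrow value below 1 at the 50th or 90th percentile means Hawk's job running time at that percentile is lower than Sparrow's.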

30 Simulated results: short jobs. Plot: Hawk/Sparrow job running time (lower is better), 50th and 90th percentiles, vs. number of nodes in the cluster (thousands). Better across the board.

31 Simulated results: long jobs. Plot: Hawk/Sparrow job running time (lower is better), 50th and 90th percentiles, vs. number of nodes in the cluster (thousands). Better except under high load.

32 Simulated results: long jobs. Plot: Hawk/Sparrow job running time (lower is better), 50th and 90th percentiles, vs. number of nodes in the cluster (thousands). Very high utilization: partitioning.

33 Decomposing Hawk. 1. Hawk minus centralized. 2. Hawk minus stealing. 3. Hawk minus partitioning. (Normalized to Hawk.)

34 Decomposing Hawk: no centralized. Chart: Hawk without the centralized scheduler, normalized to Hawk, for 50th and 90th percentile short jobs and 50th and 90th percentile long jobs.

35 Decomposing Hawk: no stealing. Chart: Hawk without stealing, normalized to Hawk, for 50th and 90th percentile short jobs and 50th and 90th percentile long jobs.

36 Decomposing Hawk: no partitioning. Chart: Hawk without partitioning, normalized to Hawk, for 50th and 90th percentile short jobs and 50th and 90th percentile long jobs.

37 Decomposing Hawk summary. Absence of any component reduces Hawk's performance! Chart: Hawk no centralized, Hawk no stealing, and Hawk no partition, for 50th and 90th percentile short jobs and 50th and 90th percentile long jobs.

38 Sensitivity analysis. 1. Incorrect estimates of runtime. 2. Long/short cut-off. 3. Details of stealing.

39 Sensitivity analysis. 1. Incorrect estimates of runtime. 2. Long/short cut-off. 3. Details of stealing. Bottom line: the benefits of Hawk remain despite variation. See paper for details.

40 Evaluation: 2. Implementation. Diagram: a Hawk scheduler and Hawk daemons.

41 Experiment. 100-node cluster. Subset of Google trace. Vary inter-arrival time to vary cluster utilization. Measure: job running time. Report 50th and 90th percentiles for short and long jobs, normalized to Sparrow.

42 Short jobs. Plots: Hawk/Sparrow (lower is better) vs. inter-arrival time. Series: real 50th, simulated 50th, real 90th, simulated 90th.

43 Long jobs. Plots: Hawk/Sparrow (lower is better) vs. inter-arrival time. Series: real 50th, simulated 50th, real 90th, simulated 90th.

44 Implementation. 1. Hawk works well in a real cluster. 2. Good correspondence between implementation and simulation.

45 Related work. Centralized: Hadoop Fair Scheduler (Eurosys '10), Quincy (SOSP '09). Two-level: Yarn (SoCC '12), Mesos (NSDI '11). Distributed schedulers: Omega (Eurosys '12), Sparrow (SOSP '13). Hybrid schedulers: Mercury.

46 Conclusion. Hawk: hybrid scheduler (long: centralized, short: distributed), work-stealing, cluster partitioning. Hawk provides good results for short and long jobs, even under high cluster utilization.
