Lecture 11: CPU Scheduling


CS 422/522 Design & Implementation of Operating Systems
Lecture 11: CPU Scheduling
Zhong Shao, Dept. of Computer Science, Yale University

Acknowledgement: some slides are taken from previous versions of the CS 422/522 lectures taught by Prof. Bryan Ford and Dr. David Wolinsky, and also from the official set of slides accompanying the OSPP textbook by Anderson and Dahlin.

CPU scheduler
- Selects from among the processes in memory that are ready to execute, and allocates the CPU to one of them.
- CPU scheduling decisions may take place when a process:
  1. switches from running to waiting state,
  2. switches from running to ready state,
  3. switches from waiting to ready state, or
  4. terminates.
- Scheduling under 1 and 4 is nonpreemptive. All other scheduling is preemptive.

Main points
- Scheduling policy: what to do next when there are multiple threads ready to run (or multiple packets to send, or web requests to serve, ...)
- Definitions: response time, throughput, predictability
- Uniprocessor policies: FIFO, round robin, optimal; multilevel feedback as an approximation of optimal
- Multiprocessor policies: affinity scheduling, gang scheduling
- Queueing theory: can you predict/improve a system's response time?

Example
- You manage a web site that suddenly becomes wildly popular. Do you:
  * Buy more hardware?
  * Implement a different scheduling policy?
  * Turn away some users? Which ones?
- How much worse will performance get if the web site becomes even more popular?

Definitions
- Task/job: a user request, e.g., a mouse click, web request, or shell command
- Latency/response time: how long does a task take to complete?
- Throughput: how many tasks can be done per unit of time?
- Overhead: how much extra work is done by the scheduler?
- Fairness: how equal is the performance received by different users?
- Predictability: how consistent is the performance over time?

More definitions
- Workload: the set of tasks for the system to perform
- Preemptive scheduler: one that can take resources away from a running task
- Work-conserving: the resource is used whenever there is a task to run (for non-preemptive schedulers, work-conserving is not always better)
- A scheduling algorithm takes a workload as input, decides which tasks to do first, and yields a performance metric (throughput, latency) as output
- Only preemptive, work-conserving schedulers are considered here

Scheduling policy goals
- Minimize response time: the elapsed time to do an operation (or job). Response time is what the user sees: the elapsed time to
  * echo a keystroke in an editor
  * compile a program
  * run a large scientific problem
- Maximize throughput: operations (jobs) per second. Two parts to maximizing throughput:
  * minimize overhead (for example, context switching)
  * efficient use of system resources (not only CPU, but disk, memory, etc.)
- Fairness: share the CPU among users in some equitable way

First In First Out (FIFO)
- Schedule tasks in the order they arrive
- Continue running them until they complete or give up the processor
- Example: memcached (Facebook's cache of friend lists, ...)
- On what workloads is FIFO particularly bad?

FIFO scheduling
Example:
  Process   Burst time
  P1        24
  P2        3
  P3        3

Suppose that the processes arrive in the order P1, P2, P3. The Gantt chart for the schedule is:

  | P1                     | P2 | P3 |
  0                        24   27   30

Waiting time for P1 = 0; P2 = 24; P3 = 27.
Average waiting time: (0 + 24 + 27)/3 = 17.

FIFO scheduling (cont'd)
Suppose instead that the processes arrive in the order P2, P3, P1. The Gantt chart for the schedule is:

  | P2 | P3 | P1                     |
  0    3    6                        30

Waiting time for P1 = 6; P2 = 0; P3 = 3.
Average waiting time: (6 + 0 + 3)/3 = 3. Much better than the previous case.

FIFO pros: simple. Cons: short jobs get stuck behind long jobs.
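The two average-waiting-time calculations above are easy to check mechanically. A minimal sketch (the function name is my own; all jobs are assumed present at time 0):

```python
def fifo_waiting_times(bursts):
    """Waiting time of each job under FIFO, in arrival order.

    With every job present at time 0, each job waits exactly the
    total burst time of everything ahead of it in the queue."""
    waits, elapsed = [], 0
    for burst in bursts:
        waits.append(elapsed)
        elapsed += burst
    return waits

# Arrival order P1, P2, P3 (bursts 24, 3, 3)
w = fifo_waiting_times([24, 3, 3])
print(w, sum(w) / len(w))        # [0, 24, 27] 17.0

# Arrival order P2, P3, P1
w = fifo_waiting_times([3, 3, 24])
print(w, sum(w) / len(w))        # [0, 3, 6] 3.0
```

The long job first versus last is the entire difference between the two averages, which is why FIFO is so sensitive to arrival order.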

Shortest-Job-First (SJF) scheduling
- Associate with each process the length of its next CPU burst; use these lengths to schedule the process with the shortest time.
- Two schemes:
  * Nonpreemptive: once given the CPU, the process cannot be preempted until it completes its CPU burst.
  * Preemptive: if a new process arrives with a CPU burst length less than the remaining time of the currently executing process, preempt. A.k.a. Shortest-Remaining-Time-First (SRTF).
- SJF is optimal but unfair:
  * Pros: gives the minimum average response time.
  * Cons: long-running jobs may starve if there are too many short jobs; difficult to implement (how do you know how long a job will take?).

Example of nonpreemptive SJF
  Process   Arrival time   Burst time
  P1        0.0            7
  P2        2.0            4
  P3        4.0            1
  P4        5.0            4

SJF (nonpreemptive):
  | P1            | P3 | P2      | P4      |
  0               7    8         12        16

Average waiting time = (0 + 6 + 3 + 7)/4 = 4.
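A minimal simulation of the nonpreemptive variant reproduces the average of 4 (the function name and tie-breaking are my own; ties go to the process listed first):

```python
def sjf_nonpreemptive(procs):
    """Nonpreemptive SJF over (name, arrival, burst) triples.
    Returns each process's waiting time; the scheduler idles
    until the next arrival if nothing is ready."""
    time, waits, pending = 0, {}, list(procs)
    while pending:
        ready = [p for p in pending if p[1] <= time]
        if not ready:
            time = min(p[1] for p in pending)   # idle until next arrival
            continue
        proc = min(ready, key=lambda p: p[2])   # shortest next burst
        name, arrival, burst = proc
        waits[name] = time - arrival            # waited since arriving
        time += burst                           # runs to completion
        pending.remove(proc)
    return waits

waits = sjf_nonpreemptive([("P1", 0, 7), ("P2", 2, 4), ("P3", 4, 1), ("P4", 5, 4)])
print(waits, sum(waits.values()) / 4)   # P1=0, P2=6, P3=3, P4=7; average 4.0
```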

Example of preemptive SJF
  Process   Arrival time   Burst time
  P1        0.0            7
  P2        2.0            4
  P3        4.0            1
  P4        5.0            4

SJF (preemptive):
  | P1  | P2  | P3 | P2  | P4      | P1            |
  0     2     4    5     7         11              16

Average waiting time = (9 + 1 + 0 + 2)/4 = 3.

FIFO vs. SJF
[Figure: timelines of five tasks scheduled under FIFO vs. SJF]
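The preemptive schedule above can be checked by simulating one time unit at a time (function name is my own; waiting time is computed as completion minus arrival minus burst):

```python
def srtf_waiting_times(procs):
    """Preemptive SJF (SRTF), simulated one time unit at a time.
    procs: list of (arrival, burst). Returns per-process waiting times."""
    n = len(procs)
    remaining = [burst for _, burst in procs]
    finish = [0] * n
    time = 0
    while any(remaining):
        ready = [i for i in range(n)
                 if procs[i][0] <= time and remaining[i] > 0]
        if not ready:
            time += 1
            continue
        i = min(ready, key=lambda j: remaining[j])  # least remaining time
        remaining[i] -= 1
        time += 1
        if remaining[i] == 0:
            finish[i] = time
    # waiting time = turnaround - burst
    return [finish[i] - procs[i][0] - procs[i][1] for i in range(n)]

waits = srtf_waiting_times([(0, 7), (2, 4), (4, 1), (5, 4)])
print(waits, sum(waits) / len(waits))   # [9, 1, 0, 2] 3.0
```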

Starvation and sample bias
- Suppose you want to compare two scheduling algorithms:
  * Create some infinite sequence of arriving tasks
  * Start measuring
  * Stop at some point
  * Compute the average response time over tasks completed between start and stop
- Is this valid or invalid?

Sample bias solutions
- Measure for long enough that the number of completed tasks is much larger than the number of uncompleted tasks. For both systems!
- Start and stop the system in idle periods (an idle period is one with no work to do)
- If the algorithms are work-conserving, both will complete the same tasks

Round Robin (RR)
- Each process gets a small unit of CPU time (a time quantum). After its time slice, it is moved to the end of the ready queue.
- The time quantum is 10-100 milliseconds on most operating systems.
- With n processes in the ready queue and time quantum q:
  * each process gets 1/n of the CPU time, in chunks of at most q time units at once
  * no process waits more than (n-1)q time units
  * each job gets an equal shot at the CPU
- Performance:
  * q too large: RR degenerates into FIFO
  * q too small: throughput suffers; you spend all your time context switching, not getting any real work done

[Figure: five tasks under round robin with a 1 ms time slice vs. a 100 ms time slice; with the short slice the rest of task 1 finishes much later]

Example: RR with time quantum = 20
  Process   Burst time
  P1        53
  P2        17
  P3        68
  P4        24

The Gantt chart is:

  | P1 | P2 | P3 | P4 | P1 | P3 | P4 | P1 | P3 | P3 |
  0    20   37   57   77   97   117  121  134  154  162

Typically, RR gives higher average turnaround time than SJF, but better response.

RR vs. FIFO
- Assuming a zero-cost time slice, is RR always better than FIFO?
- 10 jobs, each taking 100 seconds, with an RR time slice of 1 second. What would the average response time be under RR and FIFO?
  * RR: job 1 finishes at 991 s, job 2 at 992 s, ..., job 10 at 1000 s
  * FIFO: job 1 finishes at 100 s, job 2 at 200 s, ..., job 10 at 1000 s
- Comparison: RR is much worse for jobs of about the same length; RR is better for short jobs.
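The q = 20 schedule above can be reproduced with a small simulator (function name is my own; all jobs are assumed present at time 0):

```python
from collections import deque

def round_robin_chart(jobs, quantum):
    """Gantt chart for round robin with all jobs present at time 0.
    jobs: list of (name, burst). Returns (name, start, end) slices."""
    queue = deque(jobs)
    chart, time = [], 0
    while queue:
        name, remaining = queue.popleft()
        run = min(quantum, remaining)          # run one quantum, or finish
        chart.append((name, time, time + run))
        time += run
        if remaining > run:
            queue.append((name, remaining - run))  # back of the line
    return chart

chart = round_robin_chart([("P1", 53), ("P2", 17), ("P3", 68), ("P4", 24)], 20)
print(" ".join(f"{n}[{s}-{e}]" for n, s, e in chart))
# P1[0-20] P2[20-37] P3[37-57] P4[57-77] P1[77-97] P3[97-117]
# P4[117-121] P1[121-134] P3[134-154] P3[154-162]
```

Note that P3 runs its last two slices back to back once it is the only job left, matching the chart.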

RR vs. FIFO (cont'd)
[Figure: five tasks under round robin with a 1 ms time slice vs. under FIFO and SJF]

Mixed workload
[Figure: an I/O-bound task repeatedly issues an I/O request, waits for it to complete, and briefly uses the CPU, while two CPU-bound tasks compete for the processor]

Max-min fairness
- How do we balance a mixture of repeating tasks: some I/O bound, needing only a little CPU, and some compute bound, which can use as much CPU as they are assigned?
- One approach: maximize the minimum allocation given to a task.
  * If any task needs less than an equal share, schedule the smallest of these first.
  * Split the remaining time using max-min.
  * If all remaining tasks need at least an equal share, split evenly.
- Approximation: every time the scheduler needs to make a choice, it chooses the task for the process with the least accumulated time on the processor.

Multi-level Feedback Queue (MFQ)
- Goals:
  * Responsiveness
  * Low overhead
  * Starvation freedom
  * Some tasks are high/low priority
  * Fairness (among equal-priority tasks)
- Not perfect at any of them!
- Used in Linux (and probably Windows, MacOS)
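The max-min rule ("satisfy the smallest demands first, then split what remains") can be sketched as follows; the function name and the example workload are invented:

```python
def max_min_allocation(demands, capacity):
    """Max-min fair shares: grant each demand smaller than an equal
    share in full, then split the leftover evenly among the rest."""
    alloc = {}
    remaining = capacity
    unsat = sorted(demands, key=demands.get)   # smallest demand first
    while unsat:
        share = remaining / len(unsat)
        task = unsat[0]
        if demands[task] <= share:             # needs less than equal share
            alloc[task] = demands[task]
            remaining -= demands[task]
            unsat.pop(0)
        else:                                  # everyone left splits evenly
            for t in unsat:
                alloc[t] = share
            return alloc
    return alloc

# An I/O-bound task needing 10% of the CPU and two compute-bound
# tasks that would happily take all of it:
shares = max_min_allocation({"io": 0.1, "cpu1": 1.0, "cpu2": 1.0}, 1.0)
print(shares)   # io gets its 0.1; cpu1 and cpu2 split the rest, 0.45 each
```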

MFQ
- A set of round-robin queues, each with a separate priority
- High-priority queues have short time slices; low-priority queues have long time slices
- The scheduler picks the first thread in the highest-priority queue
- Tasks start in the highest-priority queue; if a task's time slice expires, it drops one level

  Priority   Time slice (ms)
  1          10    <- new or I/O-bound tasks enter here
  2          20    <- time slice expiration drops a task one level
  3          40
  4          80
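A toy model of the table above (class and method names are my own; real schedulers are far more involved, but `add` models new or I/O-bound tasks entering at the top and `slice_expired` models demotion):

```python
from collections import deque

class MFQ:
    """Toy multi-level feedback queue: four round-robin levels with
    time slices of 10, 20, 40, and 80 ms."""
    SLICES = [10, 20, 40, 80]

    def __init__(self):
        self.queues = [deque() for _ in self.SLICES]

    def add(self, task):
        """New or just-unblocked tasks enter the highest-priority queue."""
        self.queues[0].append(task)

    def pick(self):
        """Return (task, time slice, level) of the next task to run."""
        for level, q in enumerate(self.queues):
            if q:
                return q.popleft(), self.SLICES[level], level
        return None

    def slice_expired(self, task, level):
        """A task that used up its whole slice drops one level."""
        level = min(level + 1, len(self.queues) - 1)
        self.queues[level].append(task)

mfq = MFQ()
mfq.add("editor")
mfq.add("compiler")
task, quantum, level = mfq.pick()
print(task, quantum)             # editor 10
mfq.slice_expired(task, level)   # editor burned its slice: now at level 1
```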

Uniprocessor summary (1)
- FIFO is simple and minimizes overhead.
- If tasks are variable in size, FIFO can have very poor average response time.
- If tasks are equal in size, FIFO is optimal in terms of average response time.
- Considering only the processor, SJF is optimal in terms of average response time.
- SJF is pessimal in terms of variance in response time.

Uniprocessor summary (2)
- If tasks are variable in size, Round Robin approximates SJF.
- If tasks are equal in size, Round Robin will have very poor average response time.
- Tasks that intermix processor and I/O benefit from SJF and can do poorly under Round Robin.

Uniprocessor summary (3)
- Max-min fairness can improve response time for I/O-bound tasks.
- Round Robin and max-min fairness both avoid starvation.
- By manipulating the assignment of tasks to priority queues, an MFQ scheduler can achieve a balance between responsiveness, low overhead, and fairness.

Multiprocessor scheduling
- What would happen if we used MFQ on a multiprocessor?
  * Contention for the scheduler spinlock
  * Cache slowdown due to the ready-list data structure pinging from one CPU to another
  * Limited cache reuse: a thread's data from the last time it ran is often still in its old cache

Per-processor affinity scheduling
- Each processor has its own ready list, protected by a per-processor spinlock
- Put a thread back on the ready list of the processor where it most recently ran (e.g., when its I/O completes, or on Condition->signal)
- Idle processors can steal work from other processors

Per-processor multi-level feedback with affinity scheduling
[Figure: three processors, each with its own multi-level feedback queue]
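The affinity-plus-stealing idea can be sketched in a few lines (names are invented; a real kernel protects each list with the per-processor spinlock mentioned above):

```python
from collections import deque

class Processor:
    """Per-processor ready list with idle-time work stealing."""
    def __init__(self):
        self.ready = deque()

    def next_task(self, others):
        if self.ready:
            return self.ready.popleft()   # affinity: prefer local threads
        for other in others:              # idle: steal from someone busy
            if other.ready:
                return other.ready.pop()  # take from the far end
        return None                       # truly idle

cpu0, cpu1 = Processor(), Processor()
cpu0.ready.extend(["t1", "t2", "t3"])
print(cpu1.next_task([cpu0]))   # cpu1 is idle and steals t3
print(cpu0.next_task([cpu1]))   # cpu0 still runs its own list first: t1
```

Stealing from the far end of the victim's list is a common choice: those threads have waited longest locally, so their cache state is most likely stale anyway.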

Scheduling parallel programs
- What happens if one thread gets time-sliced while other threads from the same program are still running?
  * Assuming the program uses locks and condition variables, it will still be correct
  * What about performance?

Bulk synchronous parallelism
- Loop at each processor:
  * Compute on local data (in parallel)
  * Barrier
  * Send (selected) data to other processors (in parallel)
  * Barrier
- Examples: MapReduce, fluid flow over a wing
- Most parallel algorithms can be recast in BSP, sacrificing a small constant factor in performance
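One BSP step (compute, barrier, communicate, barrier) can be sketched with `threading.Barrier`; the "computation" here is a made-up toy in which each worker doubles its local value and then sums everyone's results:

```python
import threading

N = 4
barrier = threading.Barrier(N)
local = list(range(N))      # each worker's local data: [0, 1, 2, 3]
totals = [0] * N

def worker(i):
    local[i] *= 2           # compute phase, on local data only
    barrier.wait()          # wait until everyone finishes computing
    totals[i] = sum(local)  # "communication": read the others' results
    barrier.wait()          # wait until everyone finishes exchanging

threads = [threading.Thread(target=worker, args=(i,)) for i in range(N)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(totals)               # [12, 12, 12, 12]
```

The barrier is exactly what makes time-slicing painful: if the scheduler preempts one worker, every other worker stalls at `barrier.wait()` until it runs again.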

Tail latency
[Figure: four processors alternate local computation and communication phases separated by barriers; each phase lasts as long as its slowest participant]

Scheduling parallel programs
- Oblivious scheduling: each processor time-slices its ready list independently of the other processors
[Figure: threads px.y (thread y of process x) from three processes interleaved arbitrarily across three processors]

Gang scheduling
[Figure: all threads of process 1 (p1.1, p1.2, p1.3) run simultaneously across the three processors, then all threads of process 2, then those of process 3; px.y denotes thread y of process x]

Parallel program speedup
[Figure: performance (inverse response time) vs. number of processors, showing three regimes: perfectly parallel speedup, diminishing returns, and limited parallelism]

Space sharing
[Figure: six processors divided between process 1 and process 2, each process getting a dedicated subset]
- Scheduler activations: the kernel tells each application its number of processors, with upcalls every time the assignment changes

Queueing theory
- Can we predict what will happen to user performance:
  * If a service becomes more popular?
  * If we buy more hardware?
  * If we change the implementation to provide more features?

Queueing model
  Arrivals -> [ Queue ] -> [ Server ] -> Departures (throughput)
- Assumption: we consider average performance in a stable system, where the arrival rate (λ) matches the departure rate (μ)

Definitions
- Queueing delay (W): wait time
- Number of tasks queued (Q)
- Service time (S): time to service the request
- Response time (R) = queueing delay + service time
- Utilization (U): fraction of time the server is busy; U = service time * arrival rate (λ)
- Throughput (X): rate of task completions; if there is no overload, throughput = arrival rate

Little's law
  N = X * R
- N: number of tasks in the system
- Applies to any stable system where arrivals match departures

Question
Suppose a system has throughput X = 100 tasks/s and average response time R = 50 ms/task.
- How many tasks are in the system on average?
- If the server takes 5 ms/task, what is its utilization?
- What is the average wait time?
- What is the average number of queued tasks?
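The question works out directly from the definitions, applying Little's law once to the whole system and once to the queue alone:

```python
X = 100       # throughput, tasks/second
R = 0.050     # average response time, seconds/task
S = 0.005     # service time, seconds/task

N = X * R     # Little's law: average number of tasks in the system
U = X * S     # utilization = arrival rate * service time
W = R - S     # queueing delay = response time - service time
Q = X * W     # Little's law applied to the queue alone

print(f"N = {N:.1f} tasks, U = {U:.0%}, W = {W * 1000:.0f} ms, Q = {Q:.1f} tasks")
# N = 5.0 tasks, U = 50%, W = 45 ms, Q = 4.5 tasks
```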

Question (cont'd)
From the example:
  X = 100 tasks/s
  R = 50 ms/task
  S = 5 ms/task
  W = 45 ms/task
  Q = 4.5 tasks
Why is W = 45 ms and not 4.5 * 5 = 22.5 ms? Hint: what if S = 10 ms? S = 1 ms?

Queueing
- What is the best-case scenario for minimizing queueing delay, keeping the arrival rate and service time constant?
- What is the worst-case scenario?

Best case: evenly spaced arrivals
[Figure: with evenly spaced arrivals, there is no queueing while λ < μ, so R = S; once λ > μ, queues grow without bound and R is undefined. Throughput X rises with the arrival rate up to its maximum μ]

Response time: best vs. worst case
[Figure: for λ < μ, queueing depends on burstiness: bursty arrivals queue even below saturation, while evenly spaced arrivals do not; for λ > μ, queues grow and R is undefined]

Queueing: average case?
- What is "average"? There can be randomness in both arrivals and service times:
  * Gaussian: arrivals are spread out around a mean value
  * Exponential: arrivals are memoryless
  * Heavy-tailed: arrivals are bursty

Exponential distribution
  f(x) = λ e^(-λx)
[Figure: probability density of the exponential distribution]

Exponential distribution (cont'd)
[Figure: birth-death chain over queue lengths 0, 1, 2, 3, 4, ..., with arrival rate λ moving right and service rate μ moving left]
- Permits a closed-form solution for the state probabilities, as a function of the arrival rate and service rate

Response time vs. utilization
  R = S / (1 - U)
[Figure: response time R stays near S at low utilization and blows up as U approaches 1, passing 20 S, 40 S, ..., 100 S]

Question
With exponential arrivals, R = S / (1 - U).
- If the system is 20% utilized and load increases by 5%, how much does response time increase?
- If the system is 90% utilized and load increases by 5%, how much does response time increase?

Variance in response time
- With exponential arrivals, variance in R = S/(1-U)^2
- What if arrivals are less bursty than exponential? What if they are more bursty?
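Plugging into R = S/(1-U) answers the question (here I read "load increases by 5%" as five percentage points of utilization):

```python
def response_time(S, U):
    """Mean response time with exponential arrivals: R = S / (1 - U)."""
    assert 0 <= U < 1, "unstable: queues grow without bound at U >= 1"
    return S / (1 - U)

S = 1.0   # service time, in arbitrary units
for before, after in [(0.20, 0.25), (0.90, 0.95)]:
    ratio = response_time(S, after) / response_time(S, before)
    print(f"U {before:.0%} -> {after:.0%}: response time grows {ratio:.2f}x")
# U 20% -> 25%: response time grows 1.07x
# U 90% -> 95%: response time grows 2.00x
```

The same increment in load is nearly free at 20% utilization but doubles the response time at 90%, which is why heavily loaded systems feel so fragile.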

What if there are multiple resources?
- Response time = sum over all resources i of (service time for resource i) / (1 - utilization of resource i)
- Implication: if you fix one bottleneck, the next most highly utilized resource will limit performance

Overload management
- What if arrivals occur faster than the service can handle them? If we do nothing, response time becomes infinite.
- Turn users away? Which ones? Average response time is best if we turn away the users with the highest service demand. Example: highway congestion.
- Degrade service? Compute the result with fewer resources. Example: CNN's static front page on 9/11.

Highway congestion (measured)
[Figure: measured highway throughput collapses once the road becomes overloaded]

Why do metro buses cluster?
- Suppose two Metro buses start 15 minutes apart. Why might they arrive at the same time?