Grid Computing Scheduling Jobs Based on Priority Using Backfilling

Size: px
Start display at page:

Download "Grid Computing Scheduling Jobs Based on Priority Using Backfilling"

Transcription

1 Grid Computing Scheduling Jobs Based on Priority Using Backfilling Sandip Fakira Lokhande 1, Sachin D. Chavhan 2, Prof. S. R. Jadhao ME Scholar, Department of CSE, Babasaheb Naik College of Engineering, Pusad 3 Associate Professor, Department of CSE, Babasaheb Naik College of Engineering, Pusad 1 sandy.lokhande123@gmail.com, 2 Chavhan_sachin@ymail.com Abstract: Grid Computing has emerged as an important new field focusing on resource sharing. One of the most challenging issues in Grid Computing is efficient scheduling of tasks. Grid computing is collection of computer resources from multiple locations to reach a common goal. since this technology allows aggregation of various computer systems for usage by many users to run software s, the data stored on it which may be sensitive and private, remains vulnerable. Scheduling independent jobs to the resources is not an easy task. Scheduling schedules maximum number of jobs to the minimum amount of resources which is a very tedious task. So the high priority job can be submitted to the corresponding resources based on the computing power and it helps to improve the performance and reduce the completion time of the job.in this paper, we propose a Priority and Backfilling-FCFS which is a scheduling algorithm based on FCFS to solve the problem of scheduling independent tasks in a grid computing system. It scheduled the task by minimum completion time and rescheduled by waiting time of each task to obtain load balance. Load Balancing is a technique to improve parallelism, utilization of resources increasing throughput managing and to reduce response time through proper distribution of the tasks. Keywords: Grid Computing, FCFS, Backfilling, priority etc. I. INTRODUCTION A great number of large scale applications in science, engineering and economy are performed on parallel systems in order to achieve high performance computing. Grid computing is collection of computer resources from multiple locations to reach a common goal. A great number of large scale applications in science, engineering and economy are performed on parallel systems in order to achieve high performance computing. A type of parallel and distributed system that enables the sharing, selection, and aggregation of geographically distributed autonomous and heterogeneous resources dynamically at runtime depending on their availability, capability, performance, cost and users quality-of-service requirements [8]. Grid computing has emerged from distributed computing where the resources of many computers in a network are used are used to solve a single problem at the same time. The process that needs a large number of computer processing cycles can use this technology to solve their problem. There are many issues in using grid computing. How to appropriately and efficiently assign resources to tasks, generally called job scheduling, is one of the important issues. The main purpose of job scheduling is to shorten the job completion time and enhance the system throughput. These scheduling algorithms can allot tasks dynamically according to the changes of processor and the tasks, and divide them with different priorities based on their own strategies, i.e., the tasks with higher priority will be submitted when the resources are relatively free However, these scheduling strategies of scheduling algorithms not only need to provide information of tasks and processor, but also have insufficiencies at such aspects as system utilization rate, response time etc. In order to improve the system utilization rate and shorten the response time, this paper presents PBFCFS algorithm (Priority and Backfilling-FCFS) which is a task scheduling algorithm based on FCFS (First Come First Served) and backfilling strategy for grid computing. When there are insufficient resources of processor, the backfilling strategy will be applied to insert with fewer resource requirements into the processing list from waiting tasks.[1]in a grid environment, the resource providers and tasks changing constantly, so the traditional scheduling algorithms, e.g. First Come, First Serve may not be suitable for a dynamic grid system. In order to improve the system utilization rate and shorten the response time, The PB-FCFS algorithm adopts the dynamic priority strategy in task scheduling, every time, the scheduling task list will be reordered by priority strategy and then by FCFS strategy[1]. When there are insufficient resources of processor, the backfilling strategy will be applied to insert with fewer resource requirements into the processing list from waiting tasks. Combinational backfilling scheduling is that according to the state of the free resources, it selects a group of small jobs to backfill the resources gap to maximize the utilization of resources in clusters. There are two special cases for CBA scheduling strategy [2]. Firstly, when there are no combinational jobs in the waiting jobs queue, the degradation of CBA is EASY backfilling. Secondly, when there are no jobs in the waiting jobs queue, the degradation of CBA is FCFS scheduling algorithm. The results of experiments show that the CBA algorithm improves the utilization of resources in clusters compared with EASY backfilling and FCFS algorithm [2] Load balance is also an important issue in grid environment. The main purpose of load balance is to balance the load of each resource in order to enhance the resource utilization and increase the system throughput. The main goal of grid is providing services with high reliability and lowest cost for large volumes of users and support group work and the most important issue in grid computing are resource management and control, 68

2 reliability and security. In traditional scheduling approaches at grid computing, scheduling time to complete tasks is considered as the most important parameter, while the timing of the economic schedulers should also implement time jobs, cost of resource use is considered. 69 Fig 1. Grid computing In order to perform the scheduling process, the grid scheduler has Collecting information of jobs submitted to the grid, then collecting available resource information, as well as computation of the mapping of jobs to selected resources, jobs allocating according to the mapping, and monitoring of job completion. II. LITERATURE REVIEW In this literature review, we will provide a review of a limited number of studies related to the grid computing with resource management and scheduling. Hong JIANG & Tianwei NI, 2009 proposes a grid computing by using the PB-FCFS task scheduling algorithm is based on FCFS and backfilling strategy. The PB-FCFS algorithm combines backfilling and priority scheduling strategy. To show that the PB-FCFS algorithm are run shorten time and improves the system throughput rate and resource utilization rate. In the paper, we made a comparison and analysis between the PB-FCFS algorithm and the EASY-backfill algorithm. In order to verify the performance of the PB-FCFS algorithm, we use two indicators to analyse, both the task execution time and the system throughput rate [1]. Shengwei YiI, Zhichao WangI, et al 2010 present a Combinational backfilling method for parallel job scheduling in clusters is proposed. In this method select multiple jobs combined from the waiting job queue to backfill to maximize the use of idle resources. A comparison with FCFS, EASY backfilling algorithm is used. Combinational backfilling algorithm improves the utilization of resources in clusters compared with EASY backfilling and FCFS algorithm [2].Zafril Rizal M Azmi et al, 2011 Improved backfilling technique which not only applied to FCFS but to others Priority Rule algorithms. Gap filling or backfilling is one of the techniques used to optimize First Come First Serve (FCFS) and have been used widely in scheduling area [3]. Deeptanoy Ghosh, Ramandeep et al, 2013 The proposed technique will give the user a single view where the user can log into the Grid resource management system, submit jobs and check the status of his jobs just like he does when he submits a job to any single resource [4]. S. K. Karthikumar, M. Udhaya Preethi et al 2013.proposed approach for load balancing and scheduling. The Throughput and Performance of the grid environment will greatly improve by an optimal load balancing approach [7]. Perform is responsible the load balancing. After load balancing at group level (if load is not balanced at group level) it will be balanced at region level and after balancing at region level (if load is not balanced at region level) it will be balanced at grid level [8]. III. GRID SCHEDULING ALGORITHMS In principle, scheduling in grids means two things ordering and mapping. When there are more than one jobs waiting for execution, ordering is performed in order to determine by which order the pending jobs are arranged. Mapping is the process of selecting a set of appropriate resources and allocating the set of resources to the grid jobs. Nature of facing with the data in large scale and distributed in grid create challenges. One of the important challenges is how to increase the performance of grid system by effective scheduling tasks. Scheduling policies play a great role in grid for distributing jobs to the appropriate resources. Effective scheduling methods could reduce job execution time and increase the efficiency of the grid. Job scheduling defines as assigning a job to specific resource to be perform. For each job that submitted to grid environment. In order to perform the scheduling process, the grid scheduler has to follow a series of steps: 1. Collecting information of jobs submitted to the grid, 2. Collecting available resource information, 3. Computation of the mapping of jobs to selected resources, 4. Jobs allocating according to the mapping, 5. Monitoring of job completion. A. FCFS Algorithm: The FCFS (first come first served) algorithm is widely used scheduling algorithm in grid computing but it doesn t remove some major performance issues such as high waiting time and less throughputs. The utilization of system resources is not optimized by using FCFS scheduling method. If there is a job which has a bigger time of execution, may use the resources of the system for a long time. In this condition the other resources would be idle and would not be used by other jobs [13]. First come first served

3 (FCFS) is the scheduling algorithm which is most simple to implement and to understand. Priority scheduling is the scheduling algorithm which assigns a certain priority to each task. In FCFS scheduling algorithm, new set of jobs are added in the job scheduling queue. Proposed algorithm in does not care about the increased priority of a job because limiting factor of increasing the priority is the maximum priority [8].It uses First in- First out (FIFO) strategy to assign the priority to processes in the order, that is same as the request made by process for the processor. Firstly, task will be selected and run orderly by FCFS strategy when entering the scheduling centre for the first time. If it does not have sufficient resource and task priority level, the task priority level is set as initial value by the scheduler, otherwise increase task priority level dynamically until the highest priority level. Secondly, the remaining tasks will re-enter the loop scheduling process when a task implement is accomplished. Firstly, the task is selected and run orderly in terms of priority level. For tasks with the same priority level, the selection and implement of one task is according to FCFS strategy. Otherwise the scheduler will continue to increase task priority level until the highest priority is reached. Finally, if there is insufficient resource, the first task in waiting queue must wait for the accomplishment of tasks in running queue and as well release resource [10]. Immediately, the backfilling strategy is applied to insert some tasks with fewer resource requirements from waiting queue into the running queue [6]. B. Backfilling Algorithms: A parallel job scheduling algorithm for clusters usually works in two phases. In the first phase, a queue-priority policy is used to prioritize the submitted jobs in a batch queue. In the second phase, a computer-allocation policy is then employed to decide when the jobs are scheduled and how the computers are allocated. If the next queued job cannot run because sufficient processors are not available, the scheduler nevertheless continues to scan the queue, and selects smaller jobs that may utilize the available resources. In select jobs for execution in first come-first-serve (FCFS) order, and run each job to completion. The problem is that this simplistic approach causes significant fragmentation, as jobs do not pack perfectly and processors are left idle. After the scheduler starts all jobs at the top of the priority queue that it can and reserves nodes for the first job that it cannot start, it enters the second scheduling phase, the Backfill Scheduling [6]. The concept of backfill is simply to schedule those jobs which can be run that will not delay the start of jobs ahead of them in the priority queue [13]. There are two major backfilling strategies: Aggressive backfill (i.e. EASY backfill) conservative backfill With conservative backfilling, each job is given a reservation when it arrives and jobs are allowed to move ahead in the queue in the condition that they do not cause any queued job to be delayed beyond their reserved start-time. Conservative backfilling moves jobs forward only if they do not delay any previously queued jobs. Conservative backfilling can backfill a small job in the back of the queue as long as it does not delay the start time of all the jobs in the front of it. Each job in the queue can get the latest start time reliably, and the resource utilization is improved. With EASY, only the job at the head of queue is given a reservation and other jobs are allowed to move ahead as long as they do not delay that job. EASY-backfilling takes a more aggressive approach and allows jobs to skip ahead provided that they do not delay the job at the head of the queue. EASY backfilling selects a small job to backfill if it does not delay the start time of the first job in the queue. The resource utilization is improved. The small jobs will be able to get more opportunities for backfilling. C. Priority Scheduling: Priority will be selected and run orderly by FCFS strategy when entering the scheduling centre for the first time. If it does not have sufficient resource and priority level, the priority level is set as initial value by the scheduler. Otherwise increase task priority level dynamically until the highest priority level. The task is selected and run orderly in terms of priority level. For tasks with the same priority level, the selection and implement of one task is according to FCFS strategy [6]. Otherwise the scheduler will continue to increase priority level until the highest priority is reached. Immediately, the backfilling strategy is applied to insert some tasks with fewer resource requirements from waiting queue into the running queue by using priority in execution time or deadline. IV. RELATED WORK There are several algorithms and approaches in the area of job scheduling in in grid. A grate number of scheduling strategies have been developed to bring grid scheduling issues to a more advanced level of efficiency. Some of the current job scheduling algorithms is: Minimum Execution Time, in this method minimum execution time is used to assign the job without considering the resource availability. Job is assigned to the resource on which it can be executed in minimum time. Allocation job without considering resource availability results less resource utilization. Minimum Completion Time, in this approach job is assigned to the resource that gives minimum completion time for the job. Job scheduling strategy is an important component of job scheduling in grid system. 70

4 It selects resources in first come first serve order and multiplies the granularity size to increase the resource computation time. The algorithm reduces the total processing time and cost of the jobs, it also maximizes the resource utilization. A first-come-first-served (FCFS) parallel job scheduling method is provided in, the most basic, simple and common method of job scheduling in system. According to the time order of jobs submitted by users, FCFS selects always the first job from the waiting job queue to map the appropriate computing resources and execute it. The system resource utilization is not high by the FCFS scheduling method. A long and large job in the front part of the waiting job queue may take up computing resources for a long time, while other computing resources are idle not to be used by other jobs. The main aim of our proposed work is to make develop scheduling jobs based on priority using backfilling for grid computing. In the proposed approach based on scheduling jobs based on priority using backfilling for grid computing. In this paper to overcome above limitation we introduce FCFS and backfilling techniques Most of the algorithms in system. To resolve the fragmentation problem in FCFS, the Backfilling is proposed that eventually brings about a major improvement in the performance of resource. In this method, we can create four modules for implementing algorithm. First module is designing model for waiting list in our system. Second module is implemented for finding the execution time, arrival time and priority all job are in the queue. Then third module are introduce all jobs are arranging according priority. Last module is applying load balancing and checking throughput rate as well as resource utilization rate is improved. We used grid sim tool for simulating method. Evaluations show because resources don t allocate to total processes of a job simultaneously, it is possible that response time be more than backfilling algorithm for proposed jobs in head of queue. The job that requests allocated the first and other if in the queue has to wait until the process is free. 71 V. CONCLUSION Grid computing is one of the major key concerns to develop the computing system that have the major ability to self-configuration and optimization.we are proposing a backfilling scheduling algorithm is proposed to select many jobs to be backfilled from the waiting job queue.the main aim of our proposed work is to make develop scheduling jobs based on priority using backfilling for grid computing. VI. REFERENCE [1] Hong Nang, Tianwei NI, PB-FCFS A Task Scheduling Algorithm Based on FCFS and Backfilling Strategy for Grid Computing, IEEE. [2] ShengweiYiI, ZhichaoWangI, Shilong Mal, etc. Huang2, Combinational Backfilling for Parallel Job Scheduling, 2nd International Conference on Education Technology and Computer (ICETC), IEEE. [3] Zafril Rizal M Azmi, Kamalrulnizam Abu Bakar et al, Scheduling Grid Jobs Using Priority Rule Algorithms and Gap Filling Techniques, International Journal of Advanced Science and Technology Vol. 37, December, [4] DeeptanoyGhosh, Ramandeep et al, An Efficient Resource Management and Scheduling Technique for Fault Tolerance in Grid Computing IJCSMC, Vol. 2, Issue. 7, July 2013, pg [5] S.K.Karthikumar, M.UdhayaPreethi et al, Fair Scheduling Approach for Load Balancing and Fault Tolerant in Grid Environment ICECCN IEEE. [6] Marish Kr. Singla, Task Scheduling Algorithms for Grid Computing with Static Jobs: A Review International Journal of Computer Science Engineering (IJCSE) Vol. 2 No.05 Sep [7] LalShriVratt Singh, Jawed Ahmed, Asif Khan, An Algorithm to Optimize the Traditional Backfill Algorithm Using Priority of Jobs for Task Scheduling Problems in Cloud Computing, International Journal of Computer Science and Information Technologies(IJCSIT), Vol. 5 (2), 2014, [8] Dipti Sharma Mr.Pradeep Mittal, Job Scheduling Algorithm for Computational Grid in Grid Computing Environment, International Journal of Advanced Research in Computer Science and Software Engineering 3(5), May , pp [9] R.Manimala, P.Suresh, Load Balanced Job Scheduling Approach for Grid Environment, [10] Adam K.L. Wong and Andrzej M. Goscinski, The Impact of Under-Estimated Length of Jobs on EASY-Backfill Scheduling, 16th Euro micro Conference on Parallel, Distributed and Network- Based Processing (2008) IEEE. [11] Bo Li, Jun Chen, Man Yang, Erfei Wang, Impact of extending the runtime of underestimated jobs in backfilling schedulers 2008 International

5 Conference on Computer Science and Software Engineering ICCSSE. IEEE [12] Bo Li, et al, Scheduling of a Relaxed Backfill Strategy with Multiple Reservations, 2010 IEEE 11th International Conference on Parallel and Distributed Computing, Applications and Technologies. [13] S. Gokul Dev et al, Job Scheduling in Grid Computing using User Deadline, IJCSIT