Incentive-Based P2P Scheduling in Grid Computing

Yanmin Zhu 1, Lijuan Xiao 2, Lionel M. Ni 1, and Zhiwei Xu 2

1 Department of Computer Science, Hong Kong University of Science and Technology, Clearwater Bay, Kowloon, Hong Kong
{zhuym,ni}@cs.ust.hk
2 Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China
xiaolijuan@software.ict.ac.cn, zxu@ict.ac.cn

Abstract. Grid computing has recently emerged as an attractive computing paradigm. In typical grid environments, there are two distinct parties, resource consumers and resource providers, which have different optimization objectives. Enabling effective interaction between the two parties (i.e., scheduling the jobs of consumers across the resources of providers) is particularly challenging because of the distributed ownership of grid resources. In this paper, we propose incentive-based P2P scheduling for grid computing, with the goal of building a practical and robust computational economy. The goal is realized by building a computational market that supports fair and healthy competition among consumers and providers. To build this healthy computational market, we propose a P2P scheduling infrastructure that efficiently supports the scheduling, together with incentive-based algorithms for consumers and providers, respectively.

1 Introduction

With the rapid development of high-speed wide-area networks and powerful yet low-cost computational resources, grid computing [1] has emerged as an attractive computing paradigm. Computational grids strive to aggregate the computational power of heterogeneous, geographically distributed, and dynamic computational resources. These resources are usually administered by different domains and owned by various entities. Therefore, they are highly autonomous and differ from each other in many aspects, such as scheduling policy, security requirements, performance strategy, and desired objective. Effective scheduling is of fundamental importance.
However, because of the unique characteristics of grid computing described above, scheduling in grid environments is particularly challenging. In typical grid environments, on the one hand, some users (resource consumers) have computational jobs to execute but may lack the computational resources for them. On the other hand, some resource owners (resource providers) have relatively underutilized resources. It is highly desirable for consumers to schedule jobs across those resources, but the scheduling is significantly complicated by the distributed ownership of grid resources. Consumers and providers are independent of each other, each having its own access policy, scheduling strategy, and optimization objective.

H. Jin, Y. Pan, N. Xiao, and J. Sun (Eds.): GCC 2004, LNCS 3251, pp. 209-216, 2004. Springer-Verlag Berlin Heidelberg 2004

In human society, economic methods have been widely employed to solve this kind of problem. Recently, a few researchers [2, 3] have started to tackle the problem by applying economic methods to grid computing. However, this research is still in its infancy, and extensive research efforts are still required. Grid environments differ dramatically from human society, so scheduling based on economic models for grid computing is a highly challenging problem.

In this paper, we propose incentive-based P2P scheduling for grid computing, with the goal of building a practical and robust computational economy. The goal is realized by building a computational market that supports fair and healthy competition among consumers and providers. The market is fully decentralized: every participant competes actively and behaves independently for its own benefit. A market is said to be healthy if every player in the market has sufficient incentive for joining the market. To build a healthy computational market, we first propose a P2P scheduling infrastructure, which takes advantage of P2P networks to support the scheduling efficiently. Second, incentive-based algorithms are designed for consumers and providers, respectively, to ensure that every participant has sufficient incentive. Detailed simulation results show that our approach is a promising solution for building a healthy and stable computational economy.

The rest of the paper is organized as follows. Section 2 presents an overview of closely related work. In Section 3, we formally define the problem and state the performance objectives. In Section 4, we describe the incentive-based P2P scheduling for grid computing in detail. Section 5 presents the simulation results. Section 6 concludes the paper.

2 Related Work

In this section, we give an overview of closely related work, with emphasis on papers that address scheduling with economic methods. Buyya et al.
[3] presented several economic models that have long been used in human society, such as the auction model, the commodity market model, the contract-net model, the bargaining model, and the bartering model. They discussed possible directions for applying these economic models to grid computing. The discussion, however, stays at a high level, and no implementation is presented. Applying these models properly to grid computing remains a major challenge.

Compute Power Market (CPM) [4] is a market-based resource management and job scheduling system for grid computing. A CPM comprises markets, resource consumers, and resource providers. A market is the mediator between consumers and providers and mediates all the information from both sides. CPM takes advantage of the way real markets work in human society. However, the centralized market server introduces several limitations, such as a single point of failure and limited scalability. Furthermore, the centralized market server requires additional organizations for regular maintenance.

Enterprise [5] is a market-like task scheduler for distributed computing environments. Each client specifies a request for bids which includes the numerical priority

of the task, which is estimated from the task execution time (i.e., the shortest task has the highest priority). Each idle computer responds with a bid giving the estimated completion time. Enterprise shares the general scheduling process with our approach, but it is limited by its design considerations. It targets local area networks, and only idle workstations bid for jobs. More importantly, Enterprise's objective is to minimize the mean flow time, while our objective is twofold.

The papers described above attempt to take advantage of economic ideas for scheduling in grid computing. However, they focus on the performance of only one party, either the consumer or the provider. To the best of our knowledge, no successful research has been conducted on building a computational market that guarantees the incentive for all participants.

3 Problem Formulation

The computational market consists of two interacting parties: resource consumers and resource providers. Resource consumers need computational resources to perform their computing tasks, referred to as jobs, and are ready to pay for the completion of the jobs. Resource providers have computational resources and try to sell computing cycles for profit. The problem is how to schedule consumer jobs onto those resources while guaranteeing that every participant gets sufficient incentive to play in the market.

The incentives stated here are twofold. For consumers, the incentive means that, given two jobs of identical job length, a consumer who pays more for its job should experience better performance than one who pays less. Examples of better performance include a lower deadline missing rate and a shorter response time. For providers, the incentive means that the earned profit should conform to the cost of its resources.
However, the cost may not correctly reflect the relative weight, since the prices of resources vary over time. Therefore, in our work we replace the cost with the computational capability to represent the relative weight of a provider. This is reasonable because a higher computational capability usually results from a higher cost.

A resource consumer can be anyone who has jobs to run and is ready to pay for their completion. A job refers to a computational task, which is usually computation-intensive. Before a job can be executed on the computational grid, some attributes have to be set properly. A job is characterized by its job length, deadline, budget, runtime requirements, and data size. The job length refers to the execution time of the job on the standard platform. The budget is the amount of money that the consumer promises to pay for the completion of the job. Each user may have a different budget assigning scheme.

A resource provider comprises the computational resources of one domain. These resources are typically interconnected by a high-speed local network and protected from the outside world by firewalls. A centralized scheduler is deployed to manage these resources efficiently. Resource providers compete actively for jobs from resource consumers and execute them to gain profit. Every provider tries to maximize its profit based on its computational capability. To estimate the computational capability of a provider, we use the widely used SPEC benchmarks [6].

The goal is to build a computational economy that enables efficient interaction between consumers and providers. It is to be realized by building a healthy computational market. The market is healthy if every player has sufficient incentive for joining it, so that the market is stable and lasting.

4 Incentive-Based P2P Scheduling

The large scale of the virtual market implies that simplicity, self-organization, robustness, and scalability are important features that the system should possess. To this end, we propose a P2P scheduling infrastructure that organizes the resource consumers and resource providers into a P2P-like network. By exploiting the advantages of P2P networks, the scheduling infrastructure greatly facilitates scheduling over the distributed grid system.

The basic idea is to form a complete competition among all participants. On the one hand, given a job request, enough providers will actively compete for it. On the other hand, given a provider, enough consumers will compete for the provider's resources. We expect that the incentives of both consumers and providers can be achieved automatically through this complete competition mechanism. Figure 1 lists the main steps for executing a job in the computational market. Every job goes through the same steps until its completion.

4.1 P2P Scheduling Infrastructure

P2P networks, such as Gnutella [7] and Kazaa [8], have been widely used in file sharing for their simplicity, scalability, and robustness. In general, there is no centralized controller in P2P networks, and each peer is autonomous. To take advantage of P2P networks, we organize resource providers into a P2P network. The P2P network forms the scheduling infrastructure on which job announcements are forwarded. A consumer initiates a job announcement and sends it into the P2P network.
On receiving the job announcement, every provider is required to forward it to all of its neighboring providers except the one from which it received the announcement.

4.2 Incentive-Based Consumer Algorithm

The behavior of a consumer is characterized by two operations: budget assigning and job offering. Given a job, the budget assigning algorithm is responsible for assigning a proper budget to the job. The job offering scheme determines to which provider, among the candidates that replied, the job is offered.

Many factors are involved in deciding the budget of a job. These factors can be broadly classified into two categories: internal and external. Internal factors include the job length and the urgent level of the deadline; external factors include the overall system load and the budget assigning schemes of other consumers in the system.

Fig. 1. The General Scheduling Procedure.

The urgent level of a job is defined as follows. By this definition, a higher urglev value means that the job must be completed more urgently.

$$\mathit{urglev} = \frac{\mathit{joblength}}{\mathit{deadline} - \mathit{creationtime}}$$

Intuitively, a longer job requires more computing cycles and therefore needs a larger budget. A higher urgent level implies that the job should be executed ahead of many other jobs and therefore requires a higher priority. To obtain a higher priority, the job has to be given a larger budget. Thus, the budget increases with both the job length and the urgent level. The following is the proposed budget assigning scheme for consumers.

$$\mathit{jobbudget} = \lambda \, (a \cdot \mathit{joblength} + b \cdot \mathit{urglev})$$

- jobbudget: job budget
- λ: job importance factor
- a, b: constant coefficients

In addition to the internal factors, the budget assigning algorithm should be aware of the external factors. One basic observation is that when the deadline missing rate increases, a consumer should conclude that its jobs were assigned too small a budget and therefore lost out when competing with other jobs. The budget assigning algorithm therefore has to adapt to the deadline missing rate. When the deadline missing rate increases, the algorithm should increase the importance factor λ accordingly; otherwise, it should decrease λ.

After sending out a job announcement, the consumer waits for a short while and expects to receive a number of replies from the providers that can meet the job's deadline. A reply includes the completion time by which the provider claims to complete the job. The job offering scheme is responsible for determining to which provider the job is offered. Different consumers may have different optimization objectives and therefore different job offering schemes.
We have implemented a job offering scheme in which a consumer offers the job to the provider that claims the earliest completion time for the job.
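The consumer-side operations described in this section can be illustrated with a minimal sketch. It is not the authors' implementation: the names `Job`, `urgency_level`, `job_budget`, `adapt_importance`, and `choose_provider` are hypothetical, the default coefficients a = b = 1 are placeholders, and the fixed adaptation step and lower bound on λ are assumptions, since the paper only states the direction in which λ should move.

```python
from dataclasses import dataclass


@dataclass
class Job:
    job_length: float     # execution time on the standard platform
    deadline: float       # absolute deadline of the job
    creation_time: float  # time at which the job was created


def urgency_level(job: Job) -> float:
    """urglev = joblength / (deadline - creationtime)."""
    slack = job.deadline - job.creation_time
    if slack <= 0:
        raise ValueError("deadline must be later than creation time")
    return job.job_length / slack


def job_budget(job: Job, lam: float, a: float = 1.0, b: float = 1.0) -> float:
    """jobbudget = lambda * (a * joblength + b * urglev); a and b are placeholders."""
    return lam * (a * job.job_length + b * urgency_level(job))


def adapt_importance(lam: float, prev_miss_rate: float, curr_miss_rate: float,
                     step: float = 0.1, lam_min: float = 0.1) -> float:
    """Raise lambda when the deadline missing rate increased, otherwise lower it.

    The fixed step and the lower bound are assumptions made for this sketch.
    """
    if curr_miss_rate > prev_miss_rate:
        return lam + step
    return max(lam_min, lam - step)


def choose_provider(replies: dict):
    """Offer the job to the provider claiming the earliest completion time.

    `replies` maps a provider id to its claimed completion time.
    """
    if not replies:
        return None
    return min(replies, key=replies.get)
```

For example, a job of length 10 created at time 100 with deadline 120 has an urgent level of 0.5; with λ = 1.5 and the placeholder coefficients it receives a budget of 15.75.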

4.3 Incentive-Based Provider Algorithm

The behavior of a resource provider is characterized by two operations: competing for jobs and local scheduling. The former decides how to compete for jobs, and the latter schedules the jobs that have been offered locally, with the aim of maximizing the profit.

Once a resource provider is offered a job, it schedules for its own benefit only, without taking the job's performance into account. To prevent the provider from breaking the promise it made to the consumer, we propose a penalty model. The basic idea is that a provider is penalized with an amount of money if it cannot complete the job before the completion time it promised to the consumer. The amount of the penalty depends on the budget of the job and the length of the overrun. To some extent, the penalty model forces the provider to keep its promise, but it still leaves the provider free to schedule locally so that it can maximize its profit. The following expression summarizes the penalty model, where T is the actual completion time, CT is the claimed completion time, DL is the deadline, and B is the job budget. p_1 B and p_2 B are the slopes of the penalty lines, where p_1 and p_2 are constant coefficients. In general, p_1 is less than p_2 because, for consumers, exceeding the deadline is more serious than exceeding the claimed completion time. MaxPen is the maximal penalty that can be imposed, which applies once T exceeds T_0.

$$
\mathit{penalty} =
\begin{cases}
0, & T \le CT \\
p_1 B \, (T - CT), & CT < T \le DL \\
p_2 B \, (T - DL) + p_1 B \, (DL - CT), & DL < T \le T_0 \\
\mathit{MaxPen}, & T > T_0
\end{cases}
$$

In the computational market, each provider competes actively for jobs and tries to maximize its profit. One basic operation of a provider is deciding how to respond to job announcements. Ideally, the provider would compete only for those jobs that result in the maximal profit. But this is hardly possible, because the provider can neither predict future job announcement arrivals nor determine whether it will be offered a specific job.
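To make the penalty model introduced earlier in this section concrete, the following sketch evaluates the piecewise penalty for a given completion time. It rests on two assumptions that the paper does not state explicitly: the function name `penalty` and its parameter names are hypothetical, and T_0 is taken to be the point at which the linear penalty reaches MaxPen, so that the last case can be expressed by capping the result.

```python
def penalty(t: float, ct: float, dl: float, budget: float,
            p1: float, p2: float, max_pen: float) -> float:
    """Penalty owed by a provider that finishes a job at time t.

    ct is the claimed completion time, dl the job deadline (ct <= dl),
    budget the job budget B, and p1 < p2 the constant coefficients.
    Assumption: T_0 is where the linear penalty reaches max_pen,
    so the T > T_0 case is modelled by capping with min().
    """
    if ct > dl:
        raise ValueError("claimed completion time must not exceed the deadline")
    if t <= ct:
        # Promise kept: no penalty.
        return 0.0
    if t <= dl:
        # Late relative to the claim, but still within the deadline.
        return min(p1 * budget * (t - ct), max_pen)
    # Deadline missed: steeper slope beyond dl, plus the full first segment.
    linear = p2 * budget * (t - dl) + p1 * budget * (dl - ct)
    return min(linear, max_pen)
```

With CT = 10, DL = 20, B = 100, p1 = 0.01, p2 = 0.05, and MaxPen = 50, finishing at T = 15 costs 5, finishing at T = 22 costs 20, and any sufficiently late completion costs the maximal penalty of 50.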
We propose an aggressive competing algorithm for providers. Under this algorithm, each provider competes for every job whose deadline it can satisfy.

Whenever a job is completed, the provider has to decide which job to execute next. Suppose that there are n offered pending jobs. Scheduling the jobs such that the resulting profit is maximized is a well-known NP-complete problem [9]. One simple optimal solution is to examine each of the n! permutations, compute its profit, and select the permutation that produces the maximal profit, but this is computationally infeasible when n is large. We propose a heuristic local scheduling algorithm that is computationally efficient and produces a near-optimal profit. The basic idea is that each provider maintains a list of offered pending jobs, ordered with respect to profit, such that the ordering results in a near-maximal profit. The heuristic is that after a newly offered job is inserted, the relative order of the existing jobs probably does not change. Accordingly, only the n+1 possible positions for the new job need to be examined. The newly offered job is inserted at the position that produces the maximal profit among the n+1 positions. With the sorted list of jobs available, it is simple to decide which job to execute next: whenever a job is completed, the provider selects the job at the head of the list.

5 Simulation Results

We design the first experiment to study the incentives obtained by consumers with different job importance factors. In this experiment, simulations are performed using the following parameters. There are in total 20 consumers and 80 providers. 10% of the consumers are assigned the job importance factor λ=1.5, 10% are assigned λ=0.5, and the rest are assigned λ=1.0. The system load varies from 0.3 to 0.7.

Figure 2 shows the resulting incentives in terms of the deadline missing rate with respect to the system load. As seen in the figure, consumers with a higher importance factor do experience a lower deadline missing rate. Jobs with a relatively higher importance factor gain a relatively higher priority and are consequently executed earlier. This trend becomes sharper as the system load increases. Accordingly, consumers with importance factor λ=1.5 experience the lowest deadline missing rate. This experiment demonstrates that our approach succeeds in guaranteeing the incentive of consumers.

The second experiment is designed to study the incentive of providers. Before discussing the results, we define the fairness scale of an individual provider i by the following expression.

$$\text{fairness scale}_i = \frac{\text{normalized profit}_i}{\text{normalized capability}_i}$$

The fairness scale reflects the individual incentive of a provider. In the ideal case, the fairness scale of every provider is one. A fairness scale less than one means that the provider does not make a profit conforming to its computational capability, so the provider does not get enough incentive.
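As a small worked illustration of the fairness scale, the sketch below computes it for a set of providers together with the standard deviation used as the overall measure. The paper does not spell out how profit and capability are normalized; here each is assumed to be normalized as the provider's share of the respective total, and the population standard deviation is assumed. The function name `fairness_scales` is hypothetical.

```python
from statistics import pstdev


def fairness_scales(profits, capabilities):
    """Return the fairness scale of each provider.

    Assumption: normalized profit and normalized capability are the
    provider's shares of total profit and total capability.
    """
    if len(profits) != len(capabilities):
        raise ValueError("profits and capabilities must have the same length")
    total_profit = sum(profits)
    total_capability = sum(capabilities)
    if total_profit <= 0 or any(c <= 0 for c in capabilities):
        raise ValueError("total profit and all capabilities must be positive")
    return [
        (p / total_profit) / (c / total_capability)
        for p, c in zip(profits, capabilities)
    ]


# Profit exactly proportional to capability: every fairness scale is one.
ideal = fairness_scales([10.0, 20.0, 30.0], [1.0, 2.0, 3.0])

# Equal capability but unequal profit: scales deviate from one.
skewed = fairness_scales([30.0, 10.0], [1.0, 1.0])
skewed_sd = pstdev(skewed)  # population standard deviation of the scales
```

In the second case the two providers obtain fairness scales of 1.5 and 0.5, so the standard deviation is 0.5; the provider with scale 0.5 is the one that lacks sufficient incentive.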
If the fairness scale of a provider remains much less than one, the provider may quit the market. The standard deviation (SD) of the fairness scales of providers is used to study the overall incentive situation of providers. As shown in Figure 3, the SD of the fairness scales is fairly small: it is less than 25% of the ideal fairness scale. As the system load increases, the SD is reduced further. This experiment demonstrates that our approach guarantees the incentive for every provider.

Fig. 2. Resource Consumer Incentive. (Deadline missing rate (%) versus system load, for λ=0.5, λ=1.0, and λ=1.5.)

Fig. 3. Resource Provider Incentive. (Standard deviation of fairness scale versus system load.)

6 Conclusion

Distributed ownership of resources greatly complicates scheduling in grid computing. Enabling the interaction between consumers and providers is highly challenging. In this paper, we have proposed incentive-based P2P scheduling, aiming at building a decentralized, scalable, and robust computational market. In this market, each participant behaves for its own benefit only. Nevertheless, the computational market is shown to be healthy, since each participant is guaranteed to obtain sufficient incentive for joining the market. Detailed simulation results demonstrate that our approach is successful in building a healthy and scalable computational economy.

References

1. Foster, I. and C. Kesselman, The Grid 2: Blueprint for a New Computing Infrastructure. 2003: Morgan Kaufmann Publishers.
2. Shetty, S., P. Padala, and M. Frank, A Survey of Market Based Approaches in Distributed Computing. 2003.
3. Buyya, R., et al., Economic Models for Resource Management and Scheduling in Grid Computing. Special Issue on Grid Computing Environments, Journal of Concurrency and Computation: Practice and Experience, 2002. 14(13-15): p. 1507-1542.
4. Buyya, R. and S. Vazhkudai, Compute Power Market: Towards a Market-Oriented Grid, in The First International Symposium on Cluster Computing and the Grid. 2001.
5. Malone, T.W., et al., Enterprise: A Market-like Task Scheduler for Distributed Computing Environments, in The Ecology of Computation, B.A. Huberman, Editor. 1988, Amsterdam: North-Holland. p. 177-205.
6. The Standard Performance Evaluation Corporation (SPEC) Home Page, http://www.specbench.org/.
7. Gnutella Homepage, http://gnutella.wego.com, April, 2004.
8. Kazaa Homepage, http://www.kazaa.com, April, 2004.
9. Gonzalez, M.J., Deterministic Processor Scheduling. ACM Computing Surveys, 1977. 9(3): p. 173-204.