A WORKLOAD GENERATOR FOR DATABASE SYSTEM BENCHMARKS

Hoe Jin Jeong and Sang Ho Lee
School of Computing, Soongsil University, Seoul, Korea

Abstract

There are cases in which database system benchmarks should be executed in a realistic environment to get meaningful performance results. We propose a workload generator that helps benchmarks be executed in an environment similar to the real world in terms of resource status. The workload generator can create memory-bound, CPU-bound, and I/O-bound workloads. It also allows users to create a composite workload, which is similar to the real workloads users come across in practice. Finally, using TPC-C and the Wisconsin benchmark, we conduct an experiment to show the feasibility of the proposed workload generator.

1. Introduction

There have been a number of database benchmarks in the literature. TPC-C [8] is one of the representative benchmarks in the area of online transaction processing. The BORD benchmark [6] and the Wisconsin benchmark [3] are representative benchmarks in the area of decision support systems. TPC-W [8] is designed for electronic commerce applications. Database benchmarks can be classified into two categories, namely generic benchmarks and custom benchmarks [4]. A generic benchmark is created to represent a commonly perceived paradigm of use in an application domain. A custom benchmark is created by a particular customer on the basis of a specific application. Most well-known database benchmarks, including the aforementioned ones, are generic benchmarks.

Irrespective of the benchmark type, database benchmarks are usually executed with no other applications running simultaneously, in order to get the best result. This benchmark environment is much different from the real-world environment, in which a number of applications with various workloads run at the same time. In particular, custom benchmark results obtained in such unrealistic environments are unlikely to represent a meaningful performance yardstick for the database systems under test in their specific environment. End users often ask that database benchmarks be run in a realistic environment to get meaningful performance results. We therefore need a workload generator that can generate a realistic workload in terms of resource status.

A workload generator generates workloads in two ways [1]. One is an analytic approach, which uses mathematical models (such as the Poisson distribution) to simulate the behaviors of users or the characteristics of specific workloads. The other is a trace-based approach, which collects data on resource status and generates workloads based on the collected information.

Most workload generators have been built under the analytic approach. Busari and Williamson [2] proposed ProWGen, a synthetic workload generator for the evaluation of web proxy caches. ProWGen uses mathematical models to generate workloads simulating the features of web page references they defined. Jin and Bestavros [7] developed GISMO, a generator of Internet streaming media objects and workloads. GISMO generates realistic, scalable request streams for benchmarking Internet streaming media delivery techniques. Kao and Iyer [5] proposed a user-oriented synthetic workload generator that simulates the file accessing behaviors of users on the basis of real-world workload characterizations. Their workload generator was designed for experiments on and simulation of file systems.

The workload generators described above generate the particular workloads of certain applications, not the general workloads commonly found in real-world applications, because their objective is to generate workloads for particular benchmarks or applications. They are unsuitable for generating realistic workloads, because realistic workloads are very diverse. Real-world workloads are often too complicated to be modeled by mathematical models, and they are subject to dramatic change in unexpected ways.

We propose a workload generator for database system benchmarks. The workload generator generates synthetic workloads similar to realistic workloads in terms of resource status. It uses the trace-based approach. We design the workload generator to run independently, so that it is applicable to any benchmark for database systems. The workload generator generates memory-bound, CPU-bound, and I/O-bound workloads as well as a combination of the three workloads. Finally, using TPC-C and the Wisconsin benchmark, we conduct an experiment to show the feasibility of the proposed workload generator.

The remainder of this paper is organized as follows. Section 2 introduces the design decisions, the workloads, and their generation methods. The experimental results of the workload generator are presented in Section 3. Section 4 contains closing remarks.

2. The Workload Generator

We have made our design decisions to fulfill the following requirements, which emerge from a practical point of view.

- A workload generator should be flexible in terms of generating workloads. It should be able to generate a wide spectrum of workloads to be applicable to real applications. The workload generator described in this paper uses the trace-based approach, because we do not think that a handful of mathematical models can represent all real workloads in practice.

- Specification of workloads should be simple and straightforward, because otherwise a workload generator would not be used in practice. Also, the arguments that specify a workload should be easily accessible. Since all the workloads that any application can create are known to be composed of three types of workloads (memory-bound, I/O-bound, and CPU-bound), the workload generator generates all three kinds of workloads.

In addition, the workload generator produces composite workloads, which are combinations of the three individual workloads.

In order to generate a workload, the user specifies the workload arguments. The workload arguments consist of three components that represent a memory-bound workload, an I/O-bound workload, and a CPU-bound workload. The generation module consists of a memory-bound, an I/O-bound, and a CPU-bound module. The three modules can be executed independently or in chorus. The modules generate the workloads by forking user processes that consume the resources of the operating system directly.

It is difficult for a user process to control operating system resources precisely. Hence, we use the notion of a tolerance margin, which represents the gap between the user's intended workload and the actually generated workload. The values of the tolerance margin denote that, owing to our workload generation algorithms, we can generate the memory-bound, I/O-bound, and CPU-bound workloads within those margins. Our workload generator adjusts the generated workload dynamically in a feedback fashion. The workload is achieved and maintained by analyzing various information on the resource status at a fixed interval and by controlling the number of processes that generate the workload. The information analyzed differs depending on the workload type. Our workload generator is designed to work on small or medium UNIX systems.

2.1 Memory-bound Module

As Figure 1 shows, the memory-bound module generates the memory-bound workload by causing the operating system to exchange data between physical memory and swap space under the virtual memory scheme. The memory-bound module forks a number of processes that compete against each other to secure space in the physical memory managed by the operating system. In the module, the total memory all the forked processes attempt to secure is adjusted to be one and a half times the size of the physical memory. Our experiments showed that the competition for physical memory between the forked processes took place only when the total memory size was one and a half times the size of the physical memory. Initially, each process requests 200 kilobytes of memory, which is subject to adjustment according to the difference between the requested workload and the generated workload. The number of forked processes is also subject to change accordingly. A forked process may go into a sleep state to cool down the currently generated workload.

Figure 1. The generation process of the memory-bound workload module

Users express the memory-bound workload by using the io wait value in the top command. The control module tries to get as close to the requested workload as possible by controlling the number of processes, the amount of memory to request, and the sleep time of a process.
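As a rough illustration of the idea (the paper does not give source code), one memory-bound child process might look like the following C sketch. The chunk size, the number of children, and the sleep time are exactly the quantities the control module would adjust at run time; all names and values here are hypothetical.

    /* Hypothetical sketch of the memory-bound module, not the authors' code. */
    #include <stdlib.h>
    #include <unistd.h>
    #include <sys/wait.h>

    #define PAGE 4096

    /* Allocate 'bytes' of memory and keep touching every page so the operating
     * system must keep (or swap) the pages; sleep between sweeps to cool down
     * the generated workload. */
    static void memory_child(size_t bytes, unsigned pause_sec)
    {
        char *buf = malloc(bytes);
        if (buf == NULL)
            _exit(1);
        for (;;) {
            for (size_t i = 0; i < bytes; i += PAGE)
                buf[i] = (char)i;          /* touch one byte per page */
            if (pause_sec > 0)
                sleep(pause_sec);          /* cool-down phase */
        }
    }

    int main(void)
    {
        /* The control module would choose these values dynamically; here we
         * fork 4 children that each request 200 KB, the start-up setting
         * mentioned in Section 2.1. */
        for (int i = 0; i < 4; i++) {
            if (fork() == 0)
                memory_child(200 * 1024, 1);
        }
        /* A real control loop would sample the io wait value here and fork or
         * signal children until the requested workload is reached. */
        while (wait(NULL) > 0)
            ;
        return 0;
    }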

Given the user's requested workload, we are able to generate the workload within a 5% tolerance margin. The 5% tolerance margin was determined by many experiments carried out in various experimental environments.

2.2 I/O-bound Module

We generate an I/O-bound workload by forking a number of processes that issue file-related operations. Each process copies files, deletes files, and pauses for a while. Figure 2 shows the generation process of the I/O-bound workload. We use four different file sizes in our module, and the size of the smallest file is chosen to be the one that makes the io wait value in the top command 3%. The reason a forked process pauses after the file operations is to reduce the CPU-bound workload that is inevitably produced at the same time. We adjust the generated workload dynamically by changing, first, the sizes of the files under operation, second, the number of forked processes, and lastly, the period for which a process pauses.

Figure 2. The generation process of the I/O-bound workload module

Users express the I/O-bound workload by the io wait value of the top command, which is also used to represent the memory-bound workload. When users want to generate both the memory-bound workload and the I/O-bound workload simultaneously, we generate the memory-bound workload first and then the I/O-bound workload. This approach allows us to define two different workloads using the same argument, which is easy to understand as well as to implement. The generated workload can be maintained within a 3% tolerance margin of the user's requested workload. The tolerance margin in the I/O-bound workload module is 3% because the smallest file used to generate the I/O-bound workload already makes the I/O-bound workload 3%.
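A minimal C sketch of one I/O-bound child process follows, under the same caveat that the paper gives no source code: the file names, the buffer size, and the one-second pause are illustrative values, not the authors' implementation.

    /* Hypothetical sketch of the I/O-bound module: copy a file, delete the
     * copy, and pause, as described in Section 2.2. */
    #include <stdio.h>
    #include <unistd.h>

    /* Copy 'src' to 'dst' through a small buffer, forcing read and write I/O. */
    static int copy_file(const char *src, const char *dst)
    {
        char buf[64 * 1024];
        size_t n;
        FILE *in = fopen(src, "rb");
        FILE *out = fopen(dst, "wb");
        if (in == NULL || out == NULL)
            return -1;
        while ((n = fread(buf, 1, sizeof buf, in)) > 0)
            fwrite(buf, 1, n, out);
        fclose(in);
        fclose(out);
        return 0;
    }

    int main(void)
    {
        /* The real module cycles through four source files (5-20 MB) and lets
         * the control module change the file size, the number of processes,
         * and the pause length; this loop only shows the copy/delete/pause
         * cycle of one child process. */
        for (;;) {
            if (copy_file("io_src_5mb.dat", "io_copy.dat") != 0)
                return 1;
            unlink("io_copy.dat");   /* delete the copy */
            sleep(1);                /* pause to limit the concurrent CPU load */
        }
    }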

2.3 CPU-bound Module

We generate a CPU-bound workload by forking a number of processes that compute a CPU-intensive job, a simple arithmetic computation. Figure 3 presents our approach to generating the CPU-bound workload.

Figure 3. The generation process of the CPU-bound workload module

Each forked process executes a unit operation that is composed of a simple arithmetic calculation and a pause. The forked process performs the simple arithmetic calculation as many times as the number specified by (1):

    number of calculations = number of CPUs × CPU clock speed (MHz) × system clock speed (MHz) × correction factor    (1)

The number of CPUs, the CPU clock speed, and the system clock speed are obtained by executing a system command such as prtdiag. The correction factor, which is used to keep the gap between the current workload and the requested workload at about 1%, is an empirical value derived from experiments. The procedure to obtain an appropriate correction factor is as follows.

- The correction factor starts with an initial value.
- The user executes the CPU-bound workload generation and measures the generated workload when the correction factor is one and a half times larger than the start value.
- If the gap between the measured workload and 1% is greater than or equal to 0.01, the user increases or decreases the previous correction factor. The user repeats this adjustment until the workload gap is less than 0.01.

Users express the CPU-bound workload by a single number that corresponds to the summation of the user value and the kernel value in the top command. The user and kernel values indicate the amount of CPU used in user mode and in kernel mode, respectively. We can generate the workload within a 1% tolerance margin of the requested workload. The 1% tolerance margin was determined by many experiments carried out in various experimental environments.
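The following C sketch shows how one CPU-bound child process could combine formula (1) with the unit operation. The hardware constants are the values reported in Section 3, the 100 MHz system clock is our assumption (it reproduces the 196,000 iterations mentioned there), and the code is only a sketch of the described technique, not the authors' program.

    /* Hypothetical sketch of the CPU-bound module (Section 2.3). */
    #include <unistd.h>

    int main(void)
    {
        double ncpu    = 2.0;     /* number of CPUs                         */
        double cpu_mhz = 400.0;   /* CPU clock speed (MHz)                  */
        double sys_mhz = 100.0;   /* system clock speed (MHz), assumed      */
        double corr    = 2.45;    /* empirically derived correction factor  */

        /* Formula (1): how many times the unit calculation is repeated.    */
        long iterations = (long)(ncpu * cpu_mhz * sys_mhz * corr);  /* 196,000 */

        volatile double x = 0.0;
        for (;;) {
            /* Unit operation: a simple arithmetic calculation ...          */
            for (long i = 0; i < iterations; i++)
                x = x * 1.000001 + 1.0;
            sleep(1);             /* ... followed by a pause                */
        }
    }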

3. Experimental results

Our experiments consist of two phases: the first phase generates each workload alone, and the second phase generates a combination of the three individual workloads. The second experiment is to see whether the workload generator can produce a realistic workload that is good enough to be used in practice. In these experiments, we used an Enterprise E3500 server, one of the entry-level servers of Sun Microsystems, Inc. The operating system was SunOS 8. The server was equipped with two 400 MHz processors, 2 gigabytes of main memory, and 160 gigabytes of hard disks.

Figure 4. The memory-bound workload generation. (a) CPU versus time (b) Swap I/O versus time

Figure 4 shows the experimental result for the memory-bound workload. The given workload value was 30%. The total memory the forked processes were trying to secure was 3 gigabytes, which is one and a half times the physical memory size (2 gigabytes). As shown in Figure 4(a), we were able to maintain the memory-bound workload at 34.77% on average, which is within the tolerance margin. Figure 4(b) illustrates the total amount of data exchanged between physical memory and swap space. The swap read values were virtually the same as the swap write values once the requested workload had been generated (here, at 12 minutes). During approximately the first five minutes, the operating system was only loading data into physical memory, so no exchange of data pages took place.

Figure 5. The I/O-bound workload generation. (a) CPU versus time (b) Disk writes versus time

For the generation of the I/O-bound workload, we used four files with different sizes: 5 megabytes, 10 megabytes, 15 megabytes, and 20 megabytes. When the I/O-bound module used the 5 megabyte and 10 megabyte files for copy and delete operations, it paused one second to significantly diminish the CPU overhead that was inevitably generated at the same time. With the 15 megabyte and 20 megabyte files, the I/O-bound module did not need to pause, because there was little concurrent CPU overhead in our experiments. Figure 5 shows the experimental result for the I/O-bound workload. The requested workload was 50%. The workload shown in Figure 5(a) was 51.01%, which satisfies the requirement. Figure 5(b) shows that disk write operations were properly executed while the I/O-bound workload was being generated. After the initial ramp-up period, the generated workload was maintained at a stable level, which implies that the generated workload was well adjusted dynamically.

For the CPU-bound experiment, we used 2.45 as the correction factor and one second as the pause time. The simple arithmetic calculation was performed 196,000 times, as given by formula (1). The given workload value was 83%. As shown in Figure 6, the CPU-bound workload was maintained at 83.16% on average, which is within the tolerance margin. The amount of CPU used in kernel mode was trivial in this experiment, so the kernel value is not shown in Figure 6.

Figure 6. The CPU-bound workload generation.

For the composite workload experiment, we did the following:

- running the Wisconsin benchmark under the TPC-C workload
- running the Wisconsin benchmark under the generated workload

The objective of this experiment is to see whether the workload generator can generate a workload composed of the memory-bound, CPU-bound, and I/O-bound workloads. The generated workload should be similar to the TPC-C workload, so that the performance results of the Wisconsin benchmark under the two workloads should also be similar. Here, we treat the TPC-C workload as a real workload and the Wisconsin benchmark as a database benchmark. Because our workload generator uses the trace-based approach, we first have to collect information on the resource status while the real TPC-C workload is running. The information on the resource status can be collected by issuing system commands such as top and iostat.
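A possible shape of this trace-collection step is sketched below in C: the output of a resource-status command is appended to a trace file at a fixed interval while the real workload runs. The exact command line, the sampling interval, and the file name are assumptions, since the paper only names top and iostat.

    /* Hypothetical sketch of trace collection for the trace-based approach. */
    #include <stdio.h>
    #include <time.h>
    #include <unistd.h>

    int main(void)
    {
        FILE *trace = fopen("resource_trace.log", "a");
        if (trace == NULL)
            return 1;
        for (;;) {
            /* One sample: run the status command and copy its output.
             * The flags shown here are illustrative; the fields reported by
             * iostat/top depend on the platform. */
            FILE *cmd = popen("iostat -c 1 1", "r");
            if (cmd != NULL) {
                char line[256];
                fprintf(trace, "# %ld\n", (long)time(NULL));  /* timestamp */
                while (fgets(line, sizeof line, cmd) != NULL)
                    fputs(line, trace);
                pclose(cmd);
                fflush(trace);
            }
            sleep(5);   /* sampling interval (assumed) */
        }
    }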

Table 1 shows the collected information on the resource status of the real TPC-C workload and of the generated workload (the Ours columns) for each number of warehouses in the TPC-C benchmark. Table 1 also presents the differences between the real TPC-C workload and the generated workload. Note that all the differences between the TPC-C columns and the Ours columns are within the tolerance margins.

Table 1. The results of the TPC-C workload and the workload generator

    Scale factor    | user + kernel (%)               | io wait (%)
                    | TPC-C   Ours   difference (%)   | TPC-C   Ours   difference (%)
    5 warehouses    |
    warehouses      |

Finally, we ran the Wisconsin benchmark under the real TPC-C workload and under the synthetically generated TPC-C workload. Figure 7 illustrates the elapsed times of the queries of the Wisconsin benchmark under the two workloads. If the synthetically generated workload were identical to the TPC-C workload, the two graphs would be virtually the same. The two graphs in Figure 7 have a similar shape, although they are not identical. The main reason for the discrepancy is that TPC-C and the Wisconsin benchmark used the database system heavily, whereas the workload generator did not.

Figure 7. The results of the Wisconsin benchmark

4. Conclusion

We propose a workload generator that helps database benchmarks be executed in a realistic environment. This paper presents our approach and experimental results. The workload generator can generate memory-bound, CPU-bound, and I/O-bound workloads as well as composite workloads similar to real-world workloads. Our workload generator can be applied to various areas including databases. This paper presents the preliminary results of our workload generator. We explore the idea that we can generate a broad range of workloads by composing three different workloads, and we believe our approach works well. A great deal of experimentation should be performed to refine and verify our method in the future.

Acknowledgements

This work was supported by a Korea Research Foundation Grant (KRF-2004-D00172).

References

[1] BARFORD, P. and CROVELLA, M., Generating Representative Web Workloads for Network and Server Performance Evaluation, Proceedings of the 1998 ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems, 1998.

[2] BUSARI, M. and WILLIAMSON, C., ProWGen: A Synthetic Workload Generation Tool for Simulation Evaluation of Web Proxy Caches, The International Journal of Computer and Telecommunications Networking (Computer Networks), 38(6), 2002.

[3] DEWITT, D., The Wisconsin Benchmark: Past, Present, and Future, in Gray, J. (ed.), The Benchmark Handbook, Morgan Kaufmann.

[4] HOHENSTEIN, U., PLESSER, V., and HELLER, R., Evaluating the Performance of Object-Oriented Database Systems by Means of a Concrete Application, Proceedings of the 8th Database and Expert Systems Applications Workshop.

[5] KAO, W. and IYER, R.K., A User-Oriented Synthetic Workload Generator, Proceedings of the 12th International Conference on Distributed Computing Systems.

[6] LEE, S.H., KIM, S.J., and KIM, W., The BORD Benchmark for Object-Relational Databases, Proceedings of the 11th Database and Expert Systems Applications Conference, pages 6-20.

[7] JIN, S. and BESTAVROS, A., GISMO: A Generator of Internet Streaming Media Objects and Workloads, ACM SIGMETRICS Performance Evaluation Review, 29(3), pages 2-10.

[8] Transaction Processing Performance Council, http://www.tpc.org.