Capacity Planning for Web-Based Internet Business Services


BY ERIC D. HO AND MICHAEL E. MATCHETT

This case study illustrates the approach and methodology of a performance and capacity study of a real-world Internet-based application.

TECHNICAL SUPPORT June 97

BACKGROUND

A West Coast bank undertook a study to evaluate the performance and capacity requirements of its Internet Banking environment. The goal of the study was to use an analytical modeling technique to investigate the performance and capacity requirements of the application, and its ability to grow the user base and workload volume in a client/server environment while maintaining consistent performance objectives. The following questions were important:

Which size servers are needed to support the anticipated online users under normal and heavy usage?

With substantial and rapid business growth, how long will the initial system be able to sustain its performance objective?

Can performance bottlenecks be determined (e.g., CPU, I/O, memory, network, or the mainframe application processing)?

Internet Banking Application Architecture

The Internet Banking application is a multi-tier client/server application that facilitates banking transactions over the Internet. The tiers of servers include firewall servers, gateway servers, session servers, Object Request Broker (ORB) servers, database servers, and mainframe servers. The ORB servers are performance sensitive; the other servers act as pass-throughs for the Internet banking transactions. After first analyzing each client/server tier as a separate model, we focused on the interactions between the session host, fairway, and the pair of ORB servers, one of which is castle. These systems contain most of the Internet Banking application logic and, from our preliminary analysis, suffer larger performance degradations than other system components as the load increases.
The session server runs Internet-related processes, such as ns-httpd, httpd, and cgi, as well as the ORB client processes. The ORB servers run the ORB services, including:

bkserv - banking services;
otserv - online transaction services;
pwserv - financial auditing and access services;
xaserv - web customer authentication services.

Performance Modeling Concept

Performance modeling of client/server applications is a practical and proven technique for evaluating potential performance bottlenecks. Its key characteristics are that a system model can be constructed to represent the structure of the client/server systems, and that workloads are defined to represent the different types of transactions. Client/server performance models can generate results either analytically, by solving complex mathematical equations, or by using simulation to mimic the behavior of the system. The analytical modeling technique uses queuing equations to compute a number of performance measurements (e.g., response time and throughput) based on the contention for various resources (CPU, I/O,

memory, etc.). This approach is far more cost-effective in both time and resources than simulation, and it delivers a high degree of accuracy. The cornerstone of the study was a commercially available performance management and capacity planning (PM/CP) tool. This tool provides the vehicle to model the Internet Banking application with an analytical queuing modeling engine, technique, and approach. The heart of the study was to model the application at a logical level with basic building blocks of workloads and servers. This is in contrast to a model developed at a physical level, dealing with client workstation hardware, specific token ring and/or Ethernet LAN components, routers, bridges, wide area network (WAN) links, and various types of internal server and disk configurations. At the logical level, the server component has parameters representing the number of processors, the processing rate, memory size, network access method, and bandwidth. The workload component has attributes representing transaction types, transaction time (path length), number of disk and network I/Os, number of users, and transaction volume.

METHODOLOGY

This study used four key components of the PM/CP tool: collection agents to observe and collect system data in an efficient manner; an analysis module to build queuing models from observed data according to workload definitions; a prediction engine to solve queuing models for performance parameters; and visualization tools to understand trends, correlate metrics, and produce graphical reports. In creating the performance models, it was necessary to understand the various performance metrics for processes on the UNIX systems. These measurements included the types and volumes of business functions and transactions, time consumed per process, I/O counts, memory usage, and network usage.
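For intuition, the analytical queuing approach can be sketched with its simplest instance, an open M/M/1 queue, in which response time is service time inflated by resource contention. This is a minimal sketch of the general technique, not the PM/CP tool's actual engine:

```python
def mm1_response_time(service_time_sec, arrivals_per_sec):
    """Open M/M/1 queue: utilization U = lambda * S, response time R = S / (1 - U).
    As U approaches 1, wait time dominates and R grows without bound."""
    utilization = arrivals_per_sec * service_time_sec
    if utilization >= 1.0:
        raise ValueError("unstable: offered load meets or exceeds capacity")
    return service_time_sec / (1.0 - utilization)

# At 50% utilization a 1-second transaction takes 2 seconds end to end;
# at 90% utilization it takes about 10 seconds.
print(mm1_response_time(1.0, 0.5))   # 2.0
print(mm1_response_time(1.0, 0.9))   # ~10
```

The nonlinear blow-up near full utilization is exactly why analytical models can predict the "knee" in response time long before it is observed in production.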
The following steps were taken:

1. Data collectors were installed on the systems that are part of the Internet Banking environment.
2. Performance metrics were collected during production operations.
3. Collected data was analyzed into workload-driven queuing models.
4. A prediction engine was used to: evaluate the production servers' performance; create calibrated system models representing actual resource utilizations; modify the number of users and transaction call counts to create specific workload scenarios; predict client/server resource requirements under increasing numbers of projected users; and predict UNIX server resource service and wait times by transaction.
5. Model results were validated against external performance measurements.
6. What-if analyses were performed for system sizing and performance modeling.

To present a comprehensive performance model, workload characterization was employed. One key objective was to derive a workload that could best represent the application in terms of response time (i.e., if the application takes an average of five seconds to complete a business function, the predicted model workload response time should closely correlate to five seconds).

Performance Data Capturing

For this study, performance data was collected 24 hours a day for several days. The collection agent runs as a daemon-like process to efficiently sample kernel data structures, /dev/kmem and /proc, for process and system information. System configurations are automatically recorded.

Workload Characterization and Model Building

The data analysis module simultaneously processes performance data for interconnected systems, and apportions user and process resource consumption into meaningful workload definitions. It provides the flexibility to iteratively revise workload definitions after data collection.
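At the logical level the model reduces to two kinds of building blocks, servers and workloads, carrying exactly the parameters listed earlier. A minimal sketch of those building blocks (the field names and the utilization helper are my own illustration, not the tool's API):

```python
from dataclasses import dataclass

@dataclass
class Server:
    """Logical server component: processors, processing rate, memory, network."""
    name: str
    processors: int
    relative_speed: float      # processing rate relative to a baseline CPU
    memory_mb: int
    net_bandwidth_mbps: float

@dataclass
class Workload:
    """Logical workload component: transaction time (path length), I/Os, users, volume."""
    name: str
    cpu_sec_per_tran: float    # path length expressed as CPU seconds
    disk_ios_per_tran: int
    net_ios_per_tran: int
    users: int
    trans_per_hour: float

    def cpu_utilization_on(self, server: Server) -> float:
        """Fraction of the server's hourly CPU capacity this workload demands."""
        demand_sec = self.trans_per_hour * self.cpu_sec_per_tran
        capacity_sec = server.processors * server.relative_speed * 3600.0
        return demand_sec / capacity_sec

# Illustrative values only: 300 transactions/hour at 1.2 CPU sec each
# consumes 10% of a single baseline CPU.
host = Server("fairway", 1, 1.0, 512, 10.0)
wl = Workload("Inetbank", 1.2, 4, 2, 250, 300.0)
print(wl.cpu_utilization_on(host))   # ~0.1
```

A calibrated model is one in which these computed utilizations have been adjusted to match what the collection agents actually measured.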
It allows performance data from one or more systems to be grouped according to meaningful client/server transactions and business-oriented workloads. With workloads defined in a command file, the analysis module automatically reduces voluminous performance data and builds a queuing model that includes all nodes, devices, and workloads.

Building the Transaction Profile

The procedures to define the transaction profile were:

1. Fifty-five transaction classes were defined by grouping processes based upon their function. The default time of 1 msec was used as the initial transaction service time.
2. Sixteen workloads were defined by grouping the application transaction classes. Key workloads are shown in Figure 1.
3. The key process in this environment was sesssrv, running on the session host. The number of sesssrv processes counted is directly related to the number of active end users logged in to the Internet Banking environment. Figure 2 shows the characteristics of the sesssrv process.
4. Other Web-based processes are ns-admin, ns-httpd, httpd, and cgi. The performance characteristics of these processes are shown in Figure 3.
5. The sesssrv process count was used to compute the time per session for each method server. The service time for a method server is:

(1) service time(method server) = total hourly sec(method server) / sesssrv count

The obbclient and obbagent processes, the main ORB client and server processes, are used in the model as the primary independent transactions. The transaction service times of all the method servers were adjusted, and workloads were defined based upon the characteristics of obbclient and obbagent, as shown in Figure 4.

Performance Analysis

First, we needed to understand the load of the Internet Banking application. This was done by looking at the transaction rate of the Inetbank workload (which is defined in terms of the sesssrv process; see Figure 4).
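Equation (1) simply apportions each method server's measured hourly CPU seconds across the active sessions. In code (the sesssrv count of 243 comes from Figure 3; the CPU-seconds figure is illustrative):

```python
def method_server_service_time(total_hourly_cpu_sec, sesssrv_count):
    """Equation (1): per-session service time for a method server =
    total CPU seconds it consumed in the hour / count of sesssrv processes."""
    return total_hourly_cpu_sec / sesssrv_count

# With 243 active sessions (the sesssrv count in Figure 3) and a method
# server that consumed 486 CPU seconds in that hour (illustrative value),
# each session costs 2 CPU seconds of that method server's time.
print(method_server_service_time(486.0, 243))   # 2.0
```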
Figure 5 shows this workload's transaction rate on a typical day. As we can see, the average number of users during peak hours was about 250. One concern in this case was the high response time at the session host, fairway. Figure 6 shows a response time well over an acceptable five seconds during busy periods.

Figure 1: Workloads, listing each workload's independent transaction classes (e.g., obbclient, obbagent) and dependent transaction classes (e.g., webprocs, otserv, oninit, sysprocs, utilprocs)

Figure 2: Statistics for the sesssrv process, by hour, on Feb 16: process count, total seconds, seconds per process, and CPU %

Figure 3: Web process statistics for Feb-16-96 (count of sesssrv = 243): process count, seconds, and CPU % for ns-httpd, httpd, and cgi

Figure 4: Service times for key workloads - Inetbank on the session host (sesssrv, plus dependent webprocs; service unit per transaction and per process) and Inetbank-c on castle and its peer ORB server (obbagent, plus dependent otserv and oninit); service times range from 1 sec to 3.0 sec

Figure 5: Internet Banking transaction rate (= number of end users per hour), workload Inetbank on 2/16/96, by hour of day

Figure 6: Workload response time (secs), workload Inetbank on 2/16/96, by hour of day

Terminology Used by the PM/CP Tool

Workload - An entity representing an application or part of an application. A workload can be made up of a set of transactions which are executed at different frequencies across one or more UNIX systems.

Transaction - A logical unit of work that consumes computing resources (CPU, I/O, and memory) at a specific UNIX system. It does not necessarily correspond to an end-user banking transaction.

Response Time - A computed value of the total time spent by a transaction in the system. It generally includes CPU service time (i.e., process execution time), CPU wait time, I/O service time (time to do I/O), I/O wait time, and network service time (i.e., network transaction time, including bandwidth wait time) on the LAN.

Throughput - The number of transactions per hour. If a workload executes multiple transactions, the reported throughput for the workload is the weighted average of its transactions' throughput.

To validate the model against external measurements:

1. Different hours from different dates were selected to record the number of users, and the corresponding CPU % and response time of the Inetbank workload.
2. The values from Step 1 were plotted against the projected values computed by the prediction model. See Figure 7.

Figure 7: Measured vs. projected response times for the Inetbank workload, by number of users

Performance Modeling

A baseline prediction model was built with transaction types representing different business functions. The key attributes of these transactions include execution time and I/O counts.

Modeling Different User Levels

The baseline model was used to predict the performance impact of a growing user base. The results are summarized in Figure 8. Figure 7 shows the response time curve for the Inetbank workload as the number of users increases. Note that at 300 users, the session host fairway is constrained. Overall workload response time degrades nonlinearly due to the increasing wait time at fairway.
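That nonlinear degradation is the classic queuing knee, and a toy open-queue projection reproduces its shape. The demand and per-user rate below are illustrative assumptions, chosen only so that saturation lands at 300 users; they are not the study's measured figures:

```python
def projected_response_time(users, demand_sec=3.0, trans_per_user_hour=4.0):
    """Open-queue approximation of workload response time as users grow:
    R = D / (1 - U), where utilization U = arrival rate * demand."""
    arrivals_per_sec = users * trans_per_user_hour / 3600.0
    utilization = arrivals_per_sec * demand_sec
    if utilization >= 1.0:
        return float("inf")      # saturated: the queue grows without bound
    return demand_sec / (1.0 - utilization)

# Response time creeps up, then explodes as the user count nears 300.
for n in (100, 200, 250, 290, 300):
    print(n, projected_response_time(n))
```

Doubling the users from 100 to 200 doubles the response time (4.5 s to 9 s), but the next 100 users push it to infinity: this is why "wait time at the session host" dominates the curve.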
Sensitivity Analysis and Model Validation

To understand the validity of the model, the projected performance results were compared with measured statistics: projected values were plotted against measured values by number of users.

What-if Modeling

A number of what-if scenarios were considered: user growth of two, three, four, five, and 10 times; an upgrade for the ORB servers; load-balancing schemes between castle and its peer ORB server; adding a third ORB server; and improvement/degradation of service time for the method servers. As an example, this article presents the modeling results of the upgrade scenario.

Upgrade Modeling

Figure 9 shows the capacity of the three systems during a typical day. Two scenarios were modeled, based on an average load and a peak load, given the following upgrade proposal: castle and its peer ORB server upgraded from HP G30 to HP G70/02 (two-way), and fairway upgraded from its current DEC to a DEC at 275 MHz.

Scenario A, the average load, is shown in Figure 10. Note (*) that one system becomes saturated: the throughput cut-back takes place at 2,000 users.
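A back-of-the-envelope version of this upgrade arithmetic, assuming CPU is the binding resource (per-user demand and speedup factors below are illustrative assumptions, not the study's measured figures):

```python
def max_users_before_saturation(cpu_sec_per_user_hour, speedup=1.0):
    """Users a server can carry before its CPU saturates: one CPU supplies
    3600 CPU seconds per hour; a server `speedup` times faster supplies
    proportionally more."""
    return int(3600.0 * speedup / cpu_sec_per_user_hour)

# Illustrative: if each user costs 2.4 CPU sec/hour on the web server,
# it saturates at 1,500 users; a 2x-capacity upgrade moves that to 3,000.
print(max_users_before_saturation(2.4))        # 1500
print(max_users_before_saturation(2.4, 2.0))   # 3000
```

This is the same reasoning the model applies per scenario: each hardware change rescales the supply of CPU seconds, and the saturation point moves in proportion.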

Figure 8: Performance prediction under user growth (number of users, CPU %)

Figure 9: Utilization during a typical day, workload INETBANK on 2/16/96 (Inetbank response time and throughput; Inetbank and Inetbank-c on the ORB servers)

Scenario B, the peak load, is shown in Figure 11. Since it is the same model, the same system becomes saturated at the same point: 2,000 users. With the HP G70/02, the ORB servers can sustain a load of 2,500 users each. The Web server, fairway, however, can only support 1,500 users, even with the DEC at 275 MHz. In order to support 5,000 active users by year end, the Web server needs to double its current capacity.

MEETING THE PERFORMANCE CHALLENGE

In this case, by establishing an appropriate system prediction model early in the lifecycle, the system performance manager and capacity planner were able to quickly gauge the impact of application changes and revised growth estimates, and to provide critical and timely input for new acquisitions.

REFERENCES

BGS Systems, BEST/1 Performance Assurance for UNIX User's Manual, BGS Systems (1995).

Eric D. Ho is director of Presales Technical Support for BGS Systems, Waltham, Mass. He has more than 13 years of experience in systems engineering. Eric has been with BGS Systems for seven years, specializing in network and systems performance analysis and capacity planning.

Michael E. Matchett is a performance consultant with BGS Systems. He provides capacity planning studies of key business systems to clients, as well as assists in creating enterprise-wide performance management operations.

Copyright Technical Enterprises, Inc. Reprinted with permission of Technical Support magazine. For subscription information, mbrship@naspa.net.

Figure 10: Scenario A - average load prediction (load 1x, 2x, 3x, 4x, 5x, 10x; users, CPU %, and response time in seconds for each system)

Figure 11: Scenario B - peak load prediction (load 1x, 2x, 3x, 4x, 5x; users, CPU %, and response time in seconds for each system)