ISILON ONEFS CLUSTER OPTIMIZATION THROUGH CAPACITY MANAGEMENT

Abstract

Isilon scale-out network-attached storage (NAS) is capable of scaling to 144 nodes and 50 PB of unstructured data in a single, space-sharing namespace. Many customers seek guidance on balancing performance against full utilization of cluster capacity. This paper investigates the effects of capacity utilization on performance.

September 25, 2017

Copyright 2017 Dell Inc. or its subsidiaries. All rights reserved.

Dell believes the information in this publication is accurate as of its publication date. The information is subject to change without notice.

THE INFORMATION IN THIS PUBLICATION IS PROVIDED AS-IS. DELL MAKES NO REPRESENTATIONS OR WARRANTIES OF ANY KIND WITH RESPECT TO THE INFORMATION IN THIS PUBLICATION, AND SPECIFICALLY DISCLAIMS IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. USE, COPYING, AND DISTRIBUTION OF ANY DELL SOFTWARE DESCRIBED IN THIS PUBLICATION REQUIRES AN APPLICABLE SOFTWARE LICENSE.

Dell, EMC, and other trademarks are trademarks of Dell Inc. or its subsidiaries. Other trademarks may be the property of their respective owners. Published in the USA. EMC Corporation, Hopkinton, Massachusetts.

Contents

Introduction
What is Performance and Optimization
Does Capacity Influence Data Delivery?
Investigate Capacity Influence
Test Results Light Workload
Conclusion
Test Results Heavy Workload
Zones of Capacity Utilization
Capacity in use (Green Zone)
File system space requirement (Green Zone)
Maintenance Requirement Reserve Capacity (Yellow Zone)
Disks almost Full Zone (DaFZ)
Capacity Effect
Capacity Effect: Larger Disks
Capacity Effect: Faster Disks
Capacity Effect: Disk Rebuild
Optimizing Data Delivery through Capacity Utilization
Summary

Introduction

An Isilon OneFS cluster can deliver petabytes of capacity with multi-protocol and mixed workload consolidation benefits that make it a significant business asset. Optimal management of this storage asset will result in:

STABILITY: accommodates workload burstiness without performance impact
CONTINUITY: offers a business-acceptable window of risk during maintenance activity
LATENCY: assures acceptable user response time
THROUGHPUT: meets expected throughput numbers
DEPENDABILITY: allows jobs to finish in the allotted time
RELIABILITY: causes no escalations that divert staff from other projects

To achieve these business goals, storage administrators must understand the principles of performance variation across ranges of capacity utilization. Figure 1 summarizes conclusions from the tests discussed in this paper. In the model, capacity-based zones of performance behavior are colored as follows:

Green: response time is consistent
Yellow: performance varies greatly depending on I/O rate
Red: at any I/O rate, performance is likely to be unacceptable

Figure 1: Capacity Zones and Performance Variation

Though workload characteristics vary widely, the tests discussed in this paper show that these results apply across a spectrum of read/write ratios as well as random and sequential access patterns. By examining service time, disk transfers, and throughput at low and high workload levels as capacity utilization approaches 100%, the cause of capacity-based performance zones is revealed. Evidence shows that when space is scarce, data placement requires additional intra-file disk seeking. The rate of additional intra-file activity increases as space becomes scarcer, culminating in the red zone.

Other results from the modeling and testing discussed in this paper are:

To stay within the same response-time envelope, larger hard disk drives (HDDs) must sustain fewer operations at the same capacity point OR run at lower in-use capacity at the same operation rate, compared to smaller drives.
Faster HDDs can sustain more operations at higher capacity. The challenge is using the fast-HDD benefit without attempting too high an I/O rate or capacity.
An HDD rebuild at 95% in-use capacity experiences write operation service times two to three times longer than an HDD rebuild at 80% in-use capacity.
Managing in-use capacity on an Isilon cluster is an effective way to optimize cluster data delivery goals.

The following pages contain details of testing for capacity effects, and the results from which the above statements are derived.

What is Performance and Optimization

For this paper, performance is defined as predictable data delivery within variance limits that avoids lost production or added production costs. Optimization is deemed to be the set of actions taken to improve or restore data delivery. Metrics for data delivery include:

Response time (ms) (used in this document)
Throughput (MB/s)
Length of job (h:m:s)

Figure 2 brings these ideas together, showing a storage subsystem performance (data delivery) curve where the I/O operation load is intensified until the response time increases exponentially. The chart is representative of any storage subsystem experiencing an increasing operation rate to the point of resource saturation. Marked in green is the acceptable data delivery envelope. By definition, any sustained response time outside the green envelope incurs costs to the business, in one form or another.

Figure 2: Acceptable variance shown on a storage array operations versus response time curve

Does Capacity Influence Data Delivery?

A simple test was conducted to determine whether capacity utilization influences data delivery. The same synthetic workload was run on the same test cluster at each of the capacity points listed below:

80%
90%
95%
97%
99%

The workload characteristics were unchanged; only capacity varied. Changes to the data delivery curve for each run can therefore only be attributed to the effect of capacity. The results are presented in Figure 3.
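The hockey-stick shape of the Figure 2 curve can be illustrated with a simple single-server queueing approximation. The sketch below is not the paper's measured data or model: it assumes an M/M/1 queue and an arbitrary 5 ms mean disk service time purely to show why response time stays flat at low operation rates and rises steeply as the resource saturates.

```python
# Illustrative only: an M/M/1 queueing approximation of the knee in Figure 2.
# The 5 ms service time and the operation rates are assumed values, not test data.

SERVICE_TIME_MS = 5.0                   # assumed mean disk service time
MAX_OPS = 1000.0 / SERVICE_TIME_MS      # saturation point for one server: 200 ops/s

def response_time_ms(ops_per_sec: float) -> float:
    """Mean response time R = S / (1 - utilization) for an M/M/1 queue."""
    utilization = ops_per_sec / MAX_OPS
    if utilization >= 1.0:
        return float("inf")             # past saturation the queue grows without bound
    return SERVICE_TIME_MS / (1.0 - utilization)

if __name__ == "__main__":
    for ops in (20, 60, 100, 140, 180, 195):
        print(f"{ops:>4} ops/s -> {response_time_ms(ops):6.1f} ms")
```

At 20 ops/s the response time is barely above the raw service time; at 195 ops/s it is forty times higher, which is the behavior the green envelope in Figure 2 is meant to exclude.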

After establishing that in-use capacity influenced the data delivery curves, one additional test run was performed. The synthetic workload operation mix was edited in the following manner to produce a read-only (RO) workload that resembled the original workload as closely as possible, and that workload was run at the 97% capacity utilization point:

Write-to-file operations reduced from 10% down to 0%
Read-from-file operations raised from 18% to 30%
Metadata operations changed to read-only procedures

The results of this additional run are also presented in Figure 3.

Figure 3: Capacity effect on data delivery

In the chart above, the different curves produced when altering in-use capacity clearly show that capacity utilization has an effect on data delivery. The cause and size of that effect are unknown at this point and are investigated next in this paper. However, the following conclusions can be drawn from the graph:

When not under heavy load, response time is similar at any capacity level (the flat area of the graph).
Under load, higher capacity utilization results in higher response time. (Put another way, under load, the lowest capacity utilization has the lowest response time curve.)
Under load, a read-only workload performs better than a read-and-write workload. (In this example, read-only at 97% capacity had a data delivery curve similar to 90% capacity read and write (RW).)
Don't go near 99% full; the decline in operations between 97% and 99% is extreme. (The reason for this behavior is discussed in the following pages.)

Investigate Capacity Influence

In order to identify root causes of the capacity effect, further testing was done with these constraints:

Cluster containing HDDs only
Capacity points of 70%, 80%, 85%, 90%, 95%, 97%, and 99% cluster full

Test using a light workload at a constant operation rate. This allows investigation of capacity effects in the flat area of Figure 3.
Test using a heavy workload at a constant operation rate. This allows investigation of capacity effects in the under-load area of Figure 3, where the curves increase greatly.
Draw conclusions on the capacity effect.
What about larger disks?
What about faster disks?
Examine disk rebuild at 80% and 95% cluster capacity utilized.

The test workload was composed of:

Metadata access (GetAttr): 15%
Read/write, various ratios: 82%
Space operations (create): 3%

Read versus read/write had an influence in the earlier test, and random versus sequential access was also thought to be a candidate for influencing data delivery, so at ALL capacity points both the light and heavy workloads were tested with the following R/W ratios, with both random and then sequential access. The legend on the charts below captures this mix as:

Random access, R/W ratios: 70/30, 80/20, 85/15, 90/10, 95/5
Sequential access, R/W ratios: 70/30, 80/20, 85/15, 90/10, 95/5

Test Results Light Workload

Because the workload is so light, conclusions drawn from these test results are expected to be valid predictors of space-related behavior regardless of cluster performance configuration such as cache size, SSD, and disk type. Results are measurements of CPU utilization, disk transfers, throughput (total), and service time of create operations.

Figure 4 below shows CPU resources at 10%, indicating a light workload. CPU becomes slightly busier after 95% capacity is used.

Figure 4: Light workload, CPU utilization with increasing capacity utilization

Figure 5 shows the behavior of disk transfers occurring in the test workload as free space is reduced. This should be interpreted in conjunction with the throughput resulting from those disk transfers. The throughput is shown in Figure 6. The following observations are made from those two charts:

Random access incurred fewer disk transfers for the same throughput as less space became available. This behavior changed at 95% capacity utilization, with the disk transfer rate increasing for the same throughput.
Sequential access incurred more disk transfers for the same throughput as less space became available. This behavior changed at 95% capacity utilization, with the disk transfer rate increasing and throughput increasing as well.

Figure 5: Light workload, OneFS disk transfers per second with increasing capacity utilization

Figure 6: Light workload, throughput with increasing capacity utilization

From the above charts, it is clear that for a light workload, disk transfer behavior changes as free space is reduced, even though throughput remains unchanged until 95% in-use capacity. Service time of create operations is examined next. Create operations were only 3% of the synthetic workload, but create is a lengthy operation that was thought to best reflect changes resulting from disk transfer and throughput activity.

Figure 7: Light workload, service time with increasing capacity utilization

Note: The Figure 7 Y axis (service time) value is not displayed. In this analysis, the focus is on the change in service time by capacity, not the service time value as a performance measure.

The following observations are made from the chart in Figure 7:

Service times for create tasks with sequential access are impacted by lack of free space at the 95% full mark.
Service times for create tasks with random access do not see an increase until 97% capacity full.

Conclusion

At some high capacity utilization (around 95% in the above test results), the amount of disk activity increases. Since disk is the slowest component of the storage subsystem, service times for space allocations (create operations) also increase. The increase in disk transfers for the same workload going to less free space is attributed to file layout requiring additional non-consecutive block allocations. More intra-file seeks are then required to read each file that was allocated less optimally than when more free space was available. When the capacity effect occurs, it results in greater service time, which appears to result from busier disks due to additional intra-file seeking.

Test Results Heavy Workload

The original test suite was rerun with an additional heavy workload component to bring resource-busy considerations, such as disk queuing, into the analysis. The same metrics were collected. Averages of the previous measurements are shown on the charts to indicate the relative change in results. As before, these metrics are expected to be valid indicators of capacity-related behavior. The heavy workload was targeted to use over 60% CPU utilization. This workload increased utilization of all storage system resources. The results are not aimed at characterizing performance but rather at highlighting capacity-based behavior.

Figure 8: Heavy workload, CPU utilization with increasing capacity utilization

In Figure 8, with the heavy workload, CPU utilization shows three distinct segments: starting above 60%, dropping to 40% for most of the test suite, and declining again to a low of 25%. Average values from the light workload results are also shown.

Figure 9: Heavy workload, OneFS disk transfers per second with increasing capacity utilization

Figure 9 shows that disk transfers display the same three measurement phases. The disk transfer count is naturally much higher than with the light workload, but it also shows more variation and lacks the trends noted in the light workload analysis.

Figure 10: Heavy workload, throughput with increasing capacity utilization

Random and sequential throughput show the same segmentation in the test results (Figure 10).

Figure 11: Heavy workload, service time with increasing capacity utilization

Note: The Figure 11 Y axis (service time) scale is not displayed. In this analysis, only the change in service time by capacity is presented.

Service time results also fall into three segments. The middle section shows large variation as capacity increases (free space decreases). Heavy workload service times for space allocation tasks shown in Figure 11 are likely to reach user-unacceptable levels. These results are interpreted based on the three distinct sections observed in the graphs:

Approaching 80% capacity
80% to 96% capacity
97% capacity onwards

Approaching 80% Capacity in Use

At 70% of capacity in use (assume all points below this as well), CPU utilization is at its highest for this workload. As in-use capacity approaches 80%, there is a decline in CPU utilization and an increase in disk access. As identified in the light workload analysis, disk access becomes more frequent, with more intra-file seeks from suboptimal layout (disks are getting full). Service time for space-consuming operations is steady through the range of 70% to 80% capacity. The observed behavior is evidence that as capacity use increases beyond the 70% level with a heavy workload:

Disks becoming full cause additional disk activity through suboptimal layout. (Busier disk drives introduce additional queuing.)
Cluster CPU utilization declines. (The system waits for disk responses.)

80% to 96% Capacity

From 80% to 96% in-use capacity, CPU utilization and disk access gradually decline; however, service times for create operations increase rapidly and vary widely. With the heavy workload and additional intra-file activity from disks being full, busier disks result in longer wait times and fewer in-flight operations. The heavy workload at 70% capacity did not stress the system, but with disks getting full and added intra-file activity, the exact same workload at 80% capacity does introduce performance impact. Service times vary widely but trend strongly upwards as in-use capacity goes from 80% and approaches 96%.

97% Capacity Onwards

Above 97% in-use capacity there is a rapid decline in CPU utilization and disk access. Along with the reduced use of these resources, service time for space allocation tasks continues to increase. The explanation for the reduced disk access counts is intense disk queueing and delayed responses with the disks 100% busy.

The heavy workload test (60% CPU) greatly increased resource utilization over the light workload. The previously identified capacity-based, intra-file disk activity was present but was far more impactful. As capacity utilization increased, the impact to service time became more variable but was always trending upward. Three capacity-based areas, or zones, of behavior were identified by the test results.

Zones of Capacity Utilization

Figure 12 below heat-maps (green, yellow, and red) the capacity utilization zones identified in the heavy workload test results. A detailed description of the elements in the figure, and the proposed operational use of each zone, is discussed below.

Figure 12: Elements and zones of capacity utilization

Capacity in use (Green Zone)

The green zone represents the ideal working area for the file system. Space is efficiently obtained when required. An Isilon cluster (a group of node pools) offers an incredibly large, single-namespace green zone, but it cannot make the green zone limitless. Capacity consumption increases over time, and eventually green zone capacity is fully allocated. If left unchecked, limitations inherent in the disk Block Allocation Manager (BAM) algorithms lead to additional intra-file disk activity, marking the end of the green zone.

File system space requirement (Green Zone)

The file system requirements element of node pool capacity utilization is a subset of actual capacity in use. It is called out separately in this discussion to promote awareness that storing and managing files, as well as replicating them locally and remotely, introduces overhead on storage capacity. File metadata, the FSA database, and information for quota management are examples of features that consume file system space in proportion to the number of files managed. Storage for these overheads should be recognized as part of node pool capacity, and a realistic expectation set that total file system capacity is reduced by their presence. Snapshot and SyncIQ delta sets are also considered to be within the file system used-space number. These delta sets consume space and are dependent on the number of snapshots, the rate of data change, and the amount of protected files. Administrators should be cognizant of this space consumption; however, within this discussion only a small part of node pool capacity is modeled for metadata and snapshots.

Maintenance Requirement Reserve Capacity (Yellow Zone)

The yellow zone represents storage capacity held in reserve as a management tool for preserving business objectives during component failures and expediting the return to normal operation. Dell EMC recommends maintaining reserve capacity in each node pool based on the chosen pool protection level, aligning reserve capacity with an existing business decision. Operation of the cluster within the yellow zone reserve capacity will lead to higher variance in data delivery response time. We do not recommend conducting normal operation of the cluster in the yellow zone. However, during degraded states, yellow zone reserve capacity preserves enough space to optimize recovery processes and return the node pool to the intended data protection in the shortest possible time. Without capacity held in reserve, component failures within a node pool will lead to elongated recovery times and performance impacts. In the yellow zone, service time for space operations like file creation, writes, and deletes is expected to show variation and trend upwards as capacity utilization increases. The variance at the beginning of the yellow zone is significantly less than the variance at the end of the yellow zone.

Disks almost Full Zone (DaFZ)

When disk drives are not full and sufficient file system free space exists, files can be written and retrieved efficiently without extra intra-file seeks per request. When a file system is almost full, the disk drives in that system are almost full, and reading or writing files (especially large files) will be slower, requiring additional intra-file disk activity to place or retrieve a file stored in more chunks.

Capacity Effect

Result: Higher and more varied response time when in-use space is above 80%; response time and variation increase as free space is reduced.

When:
The light workload saw an effect at 95% cluster full.
The heavy workload shows zones of capacity effect: up to 80% cluster full, 80% to 97%, and above 97%.

How: More intra-file disk transfers. To fit allocation rules within the available space, writes are broken into smaller ranges, resulting in more disk transfers. Subsequent reads to more scattered file chunks also add work for the disks, the slowest component of the storage system. When the cluster is under load, the capacity effect adds to already busy resource queues.

Figure 13 is a visualization of the results discussed above. It also ties in the flat and curved areas of the Figure 2 graph.

Figure 13: When the capacity effect occurs
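To put a rough scale on the "How" described above, the sketch below works through simple disk-timing arithmetic for a file that has been split into progressively more non-contiguous chunks. The seek, rotational, and transfer figures are typical textbook values for a 7,200 RPM drive, assumed for illustration only; they are not measurements from the test cluster.

```python
# Back-of-the-envelope arithmetic: why extra intra-file seeks inflate service time.
# All timing constants are assumed, typical 7,200 RPM HDD values, not measured results.

SEEK_MS = 8.0              # assumed average seek time
ROTATION_MS = 4.2          # half a revolution at 7,200 RPM
TRANSFER_MS_PER_MB = 6.7   # roughly 150 MB/s sustained media rate

def read_time_ms(file_mb: float, fragments: int) -> float:
    """Time to read a file stored as `fragments` non-contiguous chunks."""
    positioning = fragments * (SEEK_MS + ROTATION_MS)   # one seek + rotation per chunk
    transfer = file_mb * TRANSFER_MS_PER_MB             # media transfer is unchanged
    return positioning + transfer

if __name__ == "__main__":
    for fragments in (1, 4, 16):
        print(f"8 MB file in {fragments:>2} chunk(s): {read_time_ms(8, fragments):6.1f} ms")
```

The transfer time is the same in every case; only the positioning work grows with fragmentation, which is why busier, fuller disks translate directly into longer service times under load.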

Capacity Effect: Larger Disks

One obvious question is how this applies to larger HDDs, 8 TB and 10 TB for example, where leaving 20% of capacity unused (creating a storage pool with 20% free space) is a lot more storage than resulted with the smaller drives used for testing. On such large drives, and on future even larger drives, should the BAM be able to find space even at high capacity levels? Yes, but these larger capacity drives also have a higher access density, where access density is I/O per second per gigabyte. Higher access density means increased contention on fewer disk spindles storing more gigabytes. The result is that response time increases at a lower operation rate.

The implication of large-drive access density on the capacity effect is illustrated in Figure 14, where the curves are moved to the left:

The red vertical line indicates small-drive operations within response time limits at lower access density. The response time curves for lower access density are not shown.
The green vertical line indicates large-drive operations within response time limits, with the curves left-shifted in response to the large drives' higher access density.

The difference between these two lines' intersections with the capacity curves shows that, to work within the acceptable data delivery envelope, larger drives must sustain fewer operations at a given capacity point (marked 1 on the chart, with a left arrow for fewer operations) OR run at lower in-use capacity at the same operation rate compared to smaller drives (marked 2 on the chart, with a down arrow for the lower capacity required to meet the data delivery envelope).

Figure 14: Large disk access density, operations, and capacity
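The access density argument can be worked through with simple arithmetic. The sketch below assumes a fixed workload intensity of 0.02 IOPS per stored gigabyte and a rough ceiling of 120 random IOPS per 7,200 RPM spindle; both numbers are illustrative assumptions, not figures from the tests. It shows how the same workload density pushes a larger drive closer to (or past) its per-spindle limit, which is why larger drives need either fewer operations or lower in-use capacity.

```python
# Illustrative arithmetic for large-drive access density. The workload intensity
# and the per-spindle IOPS ceiling are assumed values, not figures from the tests.

WORKLOAD_IOPS_PER_GB = 0.02     # assumed workload intensity against stored data
SPINDLE_IOPS_LIMIT = 120        # rough random-IOPS ceiling for a 7,200 RPM HDD

def spindle_load(drive_tb: float, pct_in_use: float) -> float:
    """IOPS each spindle must serve for the assumed workload intensity."""
    stored_gb = drive_tb * 1000 * (pct_in_use / 100.0)
    return stored_gb * WORKLOAD_IOPS_PER_GB

if __name__ == "__main__":
    for drive_tb in (2, 4, 8, 10):
        load = spindle_load(drive_tb, pct_in_use=80)
        print(f"{drive_tb:>2} TB drive at 80% full: {load:5.0f} IOPS/spindle "
              f"({load / SPINDLE_IOPS_LIMIT:4.0%} of assumed ceiling)")
```

Under these assumptions a 2 TB spindle sits well below its ceiling at 80% full, while an 8 TB or 10 TB spindle holding the same data density exceeds it, forcing either a lower operation rate or a lower in-use capacity, exactly the two options marked in Figure 14.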

Capacity Effect: Faster Disks

How does the capacity effect work with faster drives of similar size to those in the tests? Faster HDDs (higher transfer rate or lower seek time) allow more operations before the response time curve rises exponentially. In Figure 15, the flat area of the graph is longer, and the exponential increase in the curves is further to the right. Equivalent intra-file seek activity occurs as with same-sized but slower disks, but each intra-file seek has less impact, even at a higher operation rate. The danger is expecting VERY high operations and high capacity at the same time. In Figure 15:

The green vertical line indicates slow-drive operations within response time limits. Not shown are the left-shifted response time curves that identify that operating point.
With faster drives, the response time curves shift to the right, identifying a range of operations-versus-capacity solutions that stay within the data delivery envelope (two green arrows).

The capacity effect is just as relevant for faster drives. Understanding that faster drives can benefit sustainable operations and capacity, the challenge becomes successfully choosing a data delivery option that uses the fast-drive benefits but does not slip outside the envelope of acceptance. Operations-versus-capacity solutions outside the acceptable data delivery envelope are shown by the red shaded area.

Figure 15: Faster disks, operations, and capacity

Capacity Effect: Disk Rebuild

One more useful data point to consider is the capacity implication for HDD rebuild times. Since disk rebuilds are the most common maintenance activity, this situation will occur for every customer.

The following test scenario was used to measure write and create operation service times during two disk rebuilds:

Same heavy workload used in the previous tests
Baseline measurements at 80% capacity in use
Disk rebuild initiated at 80% capacity in use
Rebuild of the same disk initiated at 95% capacity in use
10% Virtual Hot Spare space

Figure 16: Disk rebuild, capacity versus write and create operation service time

The test results shown in Figure 16 indicate that, because of the capacity effect, a disk rebuild occurring with the same workload at 95% capacity compared to 80% will incur:

A 10 times increase in service time for create operations
A 2-3 times increase in service time for write operations

Virtual Hot Spare (VHS) space was reserved, but that space still has to operate at the current capacity point, making VHS just as susceptible to the capacity effect as all other file system space.

Optimizing Data Delivery through Capacity Utilization

Results from testing at various cluster capacity points were used to identify and outline the principles behind the behavior termed the capacity effect. These principles apply to all customer workloads accessing hard drives within storage arrays. (This analysis does not cover flash drives.) The exact operation limit and capacity points needed to meet each customer's envelope of acceptable data delivery cannot be precisely stated, but it is understood that predictable data delivery is valuable to any business, delivering:

Acceptable user response times
Expected throughput numbers
Jobs finishing in the allotted time
No escalations that divert staff from other projects

To achieve predictable data delivery and the listed benefits, the capacity effect must be understood, and capacity utilization monitored and managed.

Summary

To manage the risk of adverse data delivery at very high capacity levels, Dell EMC recommends a maximum of 90% capacity utilization, also stated as a reserve capacity of 10%. This recommendation is an estimate that attempts to balance capacity, workload, and degraded-state recovery for the largest proportion of customers. However, from the test results examined in this paper, that recommendation can be complemented with the following best practices to optimize cluster data delivery through capacity management:

Best Practices

At any workload level, don't go above 97% capacity utilization.
Examine data delivery variance at 80% and again at 85% capacity utilization; evaluate the consistency of data delivery before using additional capacity.
Consider a buffer for maintenance actions (disk rebuild) when planning reserve capacity.
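These best practices can be folded into a simple monitoring check. The sketch below restates the paper's thresholds (a 90% recommended maximum and a 97% hard ceiling) and accepts an optional rebuild buffer; the node pool names are hypothetical, and how the in-use percentage is collected from the cluster is deliberately left open, since that depends on the reporting tools in use.

```python
# Simple headroom check implementing the best practices above. The thresholds
# restate the paper's guidance; gathering the in-use percentage per node pool
# (from whatever OneFS reporting the site already uses) is left to local tooling.

RECOMMENDED_MAX_PCT = 90.0    # the 10% reserve capacity recommendation
HARD_CEILING_PCT = 97.0       # "at any workload level, don't go above 97%"

def headroom_status(node_pool: str, pct_in_use: float,
                    rebuild_buffer_pct: float = 0.0) -> str:
    """Return an advisory message for one node pool's capacity utilization."""
    effective_limit = RECOMMENDED_MAX_PCT - rebuild_buffer_pct
    if pct_in_use >= HARD_CEILING_PCT:
        return (f"{node_pool}: CRITICAL - {pct_in_use:.0f}% in use, "
                f"above the {HARD_CEILING_PCT:.0f}% ceiling")
    if pct_in_use >= effective_limit:
        return (f"{node_pool}: WARNING - {pct_in_use:.0f}% in use, "
                f"above the {effective_limit:.0f}% planning limit")
    return f"{node_pool}: OK - {pct_in_use:.0f}% in use"

if __name__ == "__main__":
    # pool names and utilization figures below are purely illustrative
    print(headroom_status("pool_a", 72.5, rebuild_buffer_pct=5))
    print(headroom_status("pool_b", 91.0, rebuild_buffer_pct=5))
    print(headroom_status("pool_c", 98.2))
```

Lowering the planning limit by a rebuild buffer reflects the last best practice: reserve capacity should already account for the service time penalties a rebuild incurs when the pool is nearly full.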