PureData System for Analytics Growth on Demand (GoD)

IBM Analytics Platform Services

1. How does GoD work?
GoD is built on the workload management mechanism, which allows it to work in concert with your other resource allocation parameters. It is based on how much of the system's resources are used over the horizon (default 3600 seconds): workload management (WLM) balances the allocation of the system against the system limit over the horizon period.

2. Does GoD model any existing hardware?
No. GoD relies on adding penalties for any perceived resource over-consumption. If resources are equally dimensioned between models, then the bigger model running at 50% will actually be slower than the smaller model. In other words, a 4-rack GoD system does not emulate a 2-rack Mako.

3. Does GoD affect queries equally and proportionally?
No, GoD does not affect individual queries equally and proportionally. In fact, this is one of the biggest misinterpretations of how GoD works. GoD relies on a feature called GRA (guaranteed resource allocation). The GRA scheduler does not guarantee instantaneous fairness. Rather, it guarantees to distribute (and hold back) resources over the horizon period (default 1 hour, i.e. 3600 seconds). Hence, the success or failure of GoD cannot be measured by looking at the execution or delay of individual queries.
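The difference between instantaneous usage and usage averaged over the horizon can be illustrated with a small calculation. The following is a toy sketch only, not the actual GRA algorithm: the function name `horizon_average` and the sample data are hypothetical, and the only value taken from the text is the default horizon of 3600 seconds.

```python
HORIZON_SECONDS = 3600  # default horizon stated in the text


def horizon_average(samples, horizon=HORIZON_SECONDS):
    """Average utilisation (in percent) over the trailing `horizon` seconds.

    `samples` is a chronological list of (duration_seconds, utilisation_pct)
    tuples. Only the most recent `horizon` seconds are counted.
    This is an illustrative model, not how the scheduler is implemented.
    """
    remaining = horizon
    weighted = 0.0
    # Walk backwards from the newest sample until the horizon is filled.
    for duration, pct in reversed(samples):
        used = min(duration, remaining)
        weighted += used * pct
        remaining -= used
        if remaining == 0:
            break
    covered = horizon - remaining
    return weighted / covered if covered else 0.0


# Hypothetical hour: 30 minutes at full utilisation, then 30 minutes idle.
samples = [(1800, 100.0), (1800, 0.0)]
print(horizon_average(samples))  # 50.0
```

In this hypothetical hour the instantaneous utilisation is never 50%, yet the horizon average is exactly 50%. This is why a system limit enforced over the horizon cannot be judged from the run time of any single query.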

Best Practices for GoD machines

1. Appropriate resource group creation: Create appropriate resource groups for specific job types (ETL jobs, reporting jobs, ad-hoc users, etc.). The reason is that WLM has to be fair and appropriate to each resource group based on what resources have been allocated, how much that group has used, and how much penalty gets applied.

2. ETL / load jobs should have an independent group: The GoD feature works by adding delays to each query. For loads, however, that is not possible, which is why there is another feature, GRA Load Modelling. We therefore strongly advise our customers to split loads and queries into separate resource sharing groups (RSGs) for better scheduling (and thus higher accuracy), and this is even more important for GoD. Hence, if the workload consists of loads and queries, we recommend splitting it into different RSGs.

3. Appropriate MIN / MAX resource allocation: We have seen many times that resource group allocations stay constant throughout the day or are not allocated appropriately (for example, a resource group is given a minimum of 30% whereas its actual usage is 40%, or vice versa). As a best practice, these allocations should change based on the priority / importance of the jobs. One can use the _v_sched_gra view (which holds about 2 days of data) to compare TARGET_RSG_PCT versus ACTUAL_RSG_PCT through the day and adjust the percentage allocation accordingly. See the sample chart for reference. The example in the chart illustrates the case where a resource group has been allocated far less than what it actually uses. In GoD systems, WLM will start applying more penalty to this group because of the difference between target and actual.

4. Use of scheduler rules: One can use scheduler rules to divert short queries / critical jobs / load jobs into resource groups with a higher percentage allocation. Some sample scheduler rules are:

Example 1: Creating a scheduler rule that runs short, high-priority queries (estimated at less than 1 second) in a high-priority resource group.

    CREATE OR REPLACE SCHEDULER RULE CRITRULE AS
    IF PRIORITY IS HIGH ESTIMATE < 1
    THEN EXECUTE AS RESOURCEGROUP RTL_PROD_REPORTING_URGENT;

Example 2: Creating a scheduler rule that moves load jobs / queries into a separate resource group.

    CREATE OR REPLACE SCHEDULER RULE MOVE_THE_LOADS INCLUDING ADMIN AS
    IF TYPE IS LOAD
    THEN EXECUTE AS RESOURCEGROUP RSG_LOAD_GROUP;

Example 3: Creating a scheduler rule that moves a specific user to HIGH_CRITICAL_RSG.

    -- VVIP is an actual database user
    CREATE OR REPLACE SCHEDULER RULE SET_USERQUERY_CRITICAL AS
    IF USER IS VVIP
    THEN EXECUTE AS RESOURCEGROUP HIGH_CRITICAL_RSG;

(Note: for more details / examples, please refer to the system admin guide.)

5. Keep a consistent check on resource-intensive long-running queries: Long-running queries are often resource intensive. On GoD machines this causes other queries to repay the debt, so jobs take unexpectedly longer. Use available tools such as nz_query_stats to identify long-running queries, and work with the users / developers group on how such queries can be made more efficient.

6. Set up a runaway query event: We recommend using the runaway query event to send notifications about long-running queries so that the DBA / system admin can take appropriate action on them.
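Best practice 3 compares TARGET_RSG_PCT with ACTUAL_RSG_PCT to find groups whose allocation no longer matches their usage. A minimal sketch of that comparison follows, assuming the rows have already been exported into Python dictionaries. The two percentage column names come from the text; the `GROUP` key, the function name `groups_over_target`, the 5-point tolerance, and the sample rows are hypothetical.

```python
def groups_over_target(rows, tolerance_pct=5.0):
    """Return {group: average (actual - target) gap} for groups whose
    average actual usage exceeds their target by more than tolerance_pct.

    Each row is a dict with keys "GROUP" (hypothetical name),
    "TARGET_RSG_PCT" and "ACTUAL_RSG_PCT".
    """
    totals = {}
    for row in rows:
        gap = row["ACTUAL_RSG_PCT"] - row["TARGET_RSG_PCT"]
        total, count = totals.get(row["GROUP"], (0.0, 0))
        totals[row["GROUP"]] = (total + gap, count + 1)
    # Keep only groups whose mean gap is above the tolerance.
    return {
        group: total / count
        for group, (total, count) in totals.items()
        if total / count > tolerance_pct
    }


# Hypothetical samples, not real system output.
rows = [
    {"GROUP": "REPORTING", "TARGET_RSG_PCT": 30.0, "ACTUAL_RSG_PCT": 40.0},
    {"GROUP": "REPORTING", "TARGET_RSG_PCT": 30.0, "ACTUAL_RSG_PCT": 44.0},
    {"GROUP": "ETL", "TARGET_RSG_PCT": 50.0, "ACTUAL_RSG_PCT": 48.0},
]
print(groups_over_target(rows))  # {'REPORTING': 12.0}
```

Groups returned by such a check are candidates for a higher allocation, since on a GoD system a persistent gap between target and actual usage attracts additional penalty.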

Frequently Asked Questions

1. How do I benefit by buying a 50% capacity two-rack system versus just a single-rack system?
You pay an initial price with a modest premium but have built-in room for growth as you load greater data volumes or add a larger number of users running queries or complex analytics.

2. Why should I purchase a full-capacity system instead of 50% of the next larger model?
If you can forecast your data volume and performance needs over several years, then it would be less expensive to buy full capacity now instead of a minimum-capacity system.

3. If I run a system at 50%, will every query take twice as long?
No, this is not the case, and it is a common misunderstanding of how Growth on Demand works. GoD is built on the workload management mechanism, which allows it to work in concert with your other resource allocation parameters. When set at 50%, a large aggregate workload will take twice as long ON AVERAGE on a GoD system than it otherwise would. Any individual query time could vary significantly: it could run much slower than 2X or could even run faster in some instances.

4. If I run a query on a 50% GoD system and also run it on a similar system not running GoD, why doesn't it run twice as long?
The GoD mechanism is built on the Guaranteed Resource Allocation (GRA) facility. This allows it to work in concert with your other resource settings. GRA examines the work that is running over time and provides extra resources to member groups that have not been getting their fair share. This means that if a query comes in from a group that has received most of the system's resources, it will be delayed significantly to balance things out. Likewise, if a query comes in from a group that has not had much activity on the system, it will be granted extra resources and will complete more quickly.

5. If I don't use Guaranteed Resource Allocation at all and just set the system limit, will all of my queries run in twice the time on a 50% GoD system?
It is not recommended to use the default GRA groups (i.e., the ADMIN and PUBLIC resource groups), as this will likely lead to undesirable results. The Admin and Public groups have their own particular behaviors, with Admin bypassing GRA and Public getting especially low priority. Relying on the system limit exclusively would still imply the creation of at least one group that is neither Admin nor Public. Queries run by members of that group may still experience some variation in run times due to Admin and Public activity on the system.

6. How do I ensure we remain in compliance with the license terms?
The system provides tools for measuring the used disk storage and setting system resource limits for performance. You will then have to either set up automated reporting or use the provided instructions for monthly reporting. Your technical sales engineer will assist with using the tools.

7. Who do I call if I have questions?
Contact your IBM Account Team.

8. How is Growth on Demand configured?
The system.systemlimit configuration parameter is the percentage of processing resources that are to be made available to all system users.

9. How is Growth on Demand enforced?
Enforcement is via contractual commitment for:
- Audits conducted by IBM
- Manual/automated reporting of system.systemlimit and storage usage

Abbreviations Used

GoD: Growth on Demand.
GRA: Guaranteed Resource Allocation.
WLM: Workload Management.