Compute Canada Resource Allocation Competition Bryan Caron


1 Compute Canada Resource Allocation Competition 2015. October 1-2, 2014. Bryan Caron, McGill University / Calcul Québec / Compute Canada, Montréal, QC, Canada

2 Outline. What is an Allocation? Default Allocations. Resource Allocation Opportunities, Competition 2015: Fast Track; Research Platforms and Portals (RPP); RAC Competition 2015.

3 What is an allocation? An allocation is a target level of utilization for a research group (and the group's member users) that the scheduler attempts to deliver on average over time. An allocation represents some portion of the overall cluster resource, measured in units of core-years. It is a target value for usage in the calculation of job priorities. It is not an upper limit on accumulated usage. It is not a guaranteed amount of resources. It is not an amount of resources available instantaneously and/or continuously.
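To make the core-year unit concrete, here is a minimal sketch of the conversion between core-years and core-hours. It assumes a 365-day year (8760 hours); the function names are illustrative only and are not part of any Compute Canada tool.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, assuming a non-leap year


def core_years_to_core_hours(core_years):
    """Convert an allocation in core-years to core-hours."""
    return core_years * HOURS_PER_YEAR


def core_hours_to_core_years(core_hours):
    """Convert accumulated usage in core-hours to core-years."""
    return core_hours / HOURS_PER_YEAR


# A 30 core-year allocation corresponds to 30 cores busy, on average,
# for the whole year.
print(core_years_to_core_hours(30))
```

Because an allocation is a target delivered on average over time, a group may run on many more than 30 cores at one moment and on none at another.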

4 Time on the Compute Canada systems is managed through batch processing systems that schedule the submitted workloads based on a priority mechanism. The overall facility scheduling policy is there to ensure fairness: users have access to their allocations (Fairshare), and groups are kept from excessively monopolizing the resources.

5 The scheduler system of a cluster continually calculates the priority of all submitted jobs. The job scheduling policy of the facility is applied through a variety of parameters that are used in calculating job priority: the requested resources of a job (walltime, number of cores, amount of memory), the time spent waiting in the queue, and the fairshare usage relative to the allocation of the research group (RAPid) to which the user submitting the job belongs.

6 What is fairshare? Fairshare is the portion of the job priority calculation that takes into consideration the recent historical usage of the group relative to the target level (allocation). If prior usage is below the target level, the job priority is increased. If prior usage is above the target level, the job priority is decreased. The impact of historical usage is weighted, such that yesterday's usage is more important than usage from 3 weeks ago.
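The fairshare idea can be illustrated with a small sketch. This is not the formula used by any Compute Canada scheduler: the exponential decay, the 7-day half-life, and the function names are all assumptions chosen only to show how recent usage can count more than older usage.

```python
def decayed_usage(daily_usage, half_life_days=7.0):
    """Weighted average of daily usage (in cores).

    daily_usage[0] is yesterday, daily_usage[1] the day before, and so on.
    Older days get exponentially smaller weights (assumed decay model).
    """
    weights = [0.5 ** (age / half_life_days) for age in range(len(daily_usage))]
    return sum(w * u for w, u in zip(weights, daily_usage)) / sum(weights)


def fairshare_adjustment(daily_usage, target, half_life_days=7.0):
    """Positive when weighted usage is below target, negative when above."""
    usage = decayed_usage(daily_usage, half_life_days)
    return (target - usage) / target


# Hypothetical group with a 30-core target over a 21-day window.
print(fairshare_adjustment([10] * 21, target=30))  # under target: priority boost
print(fairshare_adjustment([60] * 21, target=30))  # over target: priority penalty
```

In this sketch a burst of usage yesterday lowers the adjustment more than the same burst three weeks ago, which matches the weighting described on the slide.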

7 Calcul Québec Default Allocations for 2015. Guillimin: 30 core-years; Colosse: 30 core-years; Briaree: 30 core-years; MP2: 80 core-years (30 core-years in 2014); MS2, Cottos, Psi: 15 core-years. For all systems including outside of Calcul Québec:

8 Resource Allocation Opportunities, Competition 2015. Announced September 15. Three categories: Fast Track; Resource Allocations Competition (RAC); Research Platforms and Portals (RPP).

9 Fast Track. By invitation only. Target community: existing 2014 RAC users with minimal changes expected for 2015. Simplified application process compared to a full RAC request. Deadline: October 2,

10 Research Platforms and Portals (RPP) ** New! ** Application category examples: resources for larger communities of researchers; applications that provide a public platform using CC computing or storage; groups with international agreements for multi-year computing or storage commitments; groups providing shared datasets accessible using non-Compute Canada interfaces / portals. Timelines: Letter of Intent due September 25; selected projects invited for full application Oct 3; full proposals due October 20.

11 Resource Allocations Competition (RAC). For requests larger than a default allocation; default allocation sizes vary between systems and sites. Allocation duration: 1 year starting Jan 2015. Application Deadline: October 20,

12 Resource Allocation Competition (RAC). All applicants are advised to contact CC staff prior to submitting an application and no later than Oct 1st. All new applicants MUST contact CC staff. Please contact us at to discuss your proposals. Further information: General Inquiries about the resource opportunities:

13 Resource Allocation Competition (RAC): Why Apply? It provides prioritized access to resources for your research. General guideline: if your resource needs are about double the default amount, a RAC application is required. It also provides a clear demonstration of the benefits delivered by advanced research computing systems to Canadian researchers.

14 Resource Allocation Competition (RAC): Eligibility. Applicants can be researchers from Canadian academic institutions who are eligible to apply for funding from any of the granting agencies: regular faculty members, including adjuncts, but not postdoctoral fellows or graduate students. A lead PI (principal investigator) cannot submit more than one application but can be a participant in other submissions. The PI can delegate the ability to create the RAC application but must be the person performing the final submission.

15 RAC Guidelines. Consultation: it is highly recommended to contact Compute Canada technical staff for the systems you intend or would like to use, to ensure that the technical aspects of the proposal match the systems well. Submission is through the Compute Canada Database (CCDB): basic information (title, summary, CV, etc.); Research and Technical Justification (see Template); details of previous allocation requests and usage; resource request specifications on the available systems.

16 Evaluation Process and Criteria. Applications are evaluated for both technical feasibility and scientific excellence. Technical Review: ensures that the CC resources will be used appropriately and efficiently; performed by Compute Canada staff, either from the site where the resource was requested or by a staff member who is familiar with the technical details of the project. Expert Review Committee, followed by Chairs Committee review.

17 Evaluation Process and Criteria: Expert Review Committees. Astro and Subatomic Physics; Bioinformatics, Neuroscience and Medical Imaging; Chemistry, Biochemistry, Biophysics; Earth and Environment; Engineering, Mathematical and Computer Sciences; Humanities and Social Science; Nano, Materials and Condensed Matter.

18 Evaluation Process and Criteria. Quality of the science: originality, innovation; significant and expected contributions to research; clarity and scope of objectives; feasibility, ... Quality of the research team: knowledge, expertise and experience; quality of contributions to and impact on research. HQP (Highly Qualified Personnel) Training: quality of HQP contributions; impact on participating HQP; number of HQP engaged in project; potential cross-pollination between disciplines of HQP.

19 Application Submission Template. Introduction to Research Problem. Technical Justification: Compute Requests (Code Details; Code Performance and Utilization; Size of Request; Impact of Cut); Storage Requests (Storage Details; Storage Performance and Utilization; Size of Request; Impact of Cut); Progress over past year.

20 Technical Justification. Technical details of your computational and/or storage needs, to help ensure that cycles and storage are used as efficiently as possible, that requirements are estimated reasonably, and that appropriate systems are being used. Typically 1-2 pages in length; can be longer for projects with more complicated code or storage requirements.

21 Compute Requests: Code Details. Code names, reference publications, ... Serial or parallel? Type of parallelism? Private code? Open source? Commercial? License requirements?

22 Compute Requests: Code Performance and Utilization. For example, how many iterations/timesteps/flops per hour of wall-time? How was this estimated or measured? Appropriate systems or CPU architectures? Number and size of generated files? Is the output temporary, or for longer storage? Parallel codes: scaling efficiency (required for 256 cores or more). Checkpointable? Typical size of jobs to be run.
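One common way to report scaling efficiency is to compare the measured speedup against the ideal speedup relative to a baseline run. The sketch below assumes strong scaling (fixed problem size); the timings are hypothetical placeholders, not measurements from any Compute Canada system.

```python
def parallel_efficiency(base_cores, base_hours, cores, hours):
    """Strong-scaling efficiency relative to a baseline run.

    speedup = base_hours / hours
    ideal speedup = cores / base_cores
    efficiency = speedup / ideal speedup (1.0 means perfect scaling)
    """
    speedup = base_hours / hours
    ideal = cores / base_cores
    return speedup / ideal


# Hypothetical wall-clock times (hours) for the same problem size.
timings = {64: 100.0, 128: 52.0, 256: 28.0, 512: 17.0}

for cores, hours in timings.items():
    eff = parallel_efficiency(64, timings[64], cores, hours)
    print(f"{cores:4d} cores: {hours:6.1f} h, efficiency {eff:.0%}")
```

A table or plot of such values, together with a note on how the timings were measured, addresses the scaling question directly.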

23 Compute Requests: Size of Request. Explain how you estimated the total amount of compute time requested. Explain why your allocation is requested in addition to other resources you may have.

24 Compute Requests Example. We expect to carry out 20 sets of simulations, each of which includes four simulations at different resolutions performed using different numbers of cores: 64, 128, 256 and 512 cores. Each of the four simulations is expected to require cpu usage amounts , 96000, and core-hours. Conducting 20 such simulations of each is estimated to require 7.2M core-hours (822 core-years).
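The totals in this example can be checked with a few lines of arithmetic. The sketch uses only the totals stated on the slide (20 sets, 7.2M core-hours) and assumes a 365-day year; the per-simulation values are not reproduced here.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, assuming a non-leap year

total_core_hours = 7_200_000  # total request from the example
n_sets = 20                   # number of simulation sets

# Implied cost of one set of four simulations.
core_hours_per_set = total_core_hours / n_sets

# Convert the total request to core-years, the unit used for allocations.
core_years = total_core_hours / HOURS_PER_YEAR

print(f"{core_hours_per_set:.0f} core-hours per set")
print(f"{core_years:.0f} core-years in total")
```

Showing the calculation in this form lets reviewers see how the request follows from the planned work.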

25 Compute Requests Example. [Figure: time of calculation (hours) versus number of cores, comparing theoretical and measured scaling.]

26 Compute Requests: Size of Request. Explain how you estimated the total amount of compute time requested. Explain why your allocation is requested in addition to other resources you may have. Impact of Cut: what would be the effect on the project's research goals if the compute cycles request were to be cut by a) 25% and b) 50%?
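One way to start answering the cut question is to work out how much of the planned work still fits. The sketch below reuses the totals from the earlier compute example (7.2M core-hours for 20 sets, so 360,000 core-hours per set) and assumes that cost scales linearly with the number of sets; the function name is illustrative.

```python
def sets_affordable(total_core_hours, core_hours_per_set, cut_percent):
    """Number of complete simulation sets that fit after a percentage cut."""
    remaining = total_core_hours * (100 - cut_percent) // 100
    return remaining // core_hours_per_set


total = 7_200_000   # core-hours requested in the example
per_set = 360_000   # implied by 7.2M core-hours / 20 sets

for cut in (0, 25, 50):
    print(f"{cut:2d}% cut: {sets_affordable(total, per_set, cut)} sets")
```

The scientific consequence of running fewer sets (for example, dropping the highest resolution or reducing statistics) still needs to be explained in words.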

27 Storage Requests: Storage Details. Why is a storage allocation required instead of making use of scratch or other similar types of temporary space accessible by running jobs? Is the storage for hosting codes and data files? Other purposes (databases, web access, storage for multiple sites/systems)? What is the approximate number of files and their size distribution? Does the storage need to be accessed by running jobs? Can it be on a remote server? Is it the only copy of the data? Does it need to be backed up? What would be required to regenerate it?

28 Storage Requests: Storage Performance and Utilization. Will any aspects of the storage request vary as a function of time? Is all of the request needed in January? Will the allocation grow or vary during the year? Will the storage allocation need to persist for more than one year? Performance requirements: are bandwidth and I/O performance critical to the project? What are the estimated I/O and IOPS required, and why?

29 Storage Requests: Size of Request. Explain how you estimated the amount of storage required. What other storage systems do you have access to? Why is the allocation needed? Impact of Cut: what is the impact on the project's research goals if the storage request were to be cut by a) 25% and b) 50%?

30 Technical Justification: Progress over the Past Year. Did you have a RAC allocation in 2014? What progress did you make as a result of your allocation? Research results? Utilization levels from 2014? Did you achieve what you expected?

31 Summary. Please feel free to contact us for any discussions regarding the RAC process and your application. We would be happy to review your draft application prior to submission and give you our comments and suggestions. Please contact us as soon as possible to allow sufficient time to have your questions answered and for us to provide feedback. Contact: