Pre-announcement of Upcoming Procurement, AC2018, at National Supercomputing Centre at Linköping University


Abstract

Linköpings universitet hereby announces the opportunity to participate in a request for interest (RFI) for an upcoming procurement of a new High Performance Computing facility for the National Supercomputer Centre at Linköping University, Sweden.

Introduction

The National Supercomputer Centre at Linköpings universitet will during 2017 procure a High Performance Computing (HPC) resource, a general purpose academic compute resource (AC2018), to be installed in 2018. The procurement will be carried out using the negotiated procedure. The evaluation will be based on performance over lifetime cost for the system, and the performance will be determined using benchmarks. The intention of this pre-announcement is to allow vendors to familiarise themselves with the codes and relevant benchmark inputs, and to give vendors the opportunity to respond to an RFI published along with this pre-announcement. Vendors are encouraged to review and respond to the RFI, but this is not mandatory in order to take part in the tendering.

About the Customer

Linköpings universitet (LiU) is a research-based university with excellence in education. LiU is a multi-faculty university where research and education are equally important.

LiU has 27,000 undergraduate students, 3,900 staff and faculty members and a turnover of 3,300 MSEK. LiU hosts one of the major academic High Performance Computing (HPC) centres in Sweden, the National Supercomputer Centre (NSC). NSC is a national supercomputing centre within the Swedish National Infrastructure for Computing (SNIC). The current staff numbers around 30 people. NSC has served the Swedish academic community since 1989 as a provider of leading-edge supercomputing resources and storage services to members of academic institutions throughout Sweden, as well as to the NSC partners SMHI (Swedish Meteorological and Hydrological Institute), MET Norway, and SAAB. NSC owns and operates a number of large-scale compute and storage resources, and also offers in-depth support and help services to our users to enable the best possible performance and efficient use of the resources. NSC is and has been active in several European projects such as ENACTS, HPC4U, IS-ENES1, IS-ENES2, CLIPC, EGI and PRACE (1-ip through 5-ip). NSC also contributes directly to Swedish HPC research through its membership in the e-infrastructure organisation SeRC (Swedish e-Science Research Centre). NSC also has a collaboration on HPC services with the MaxIV laboratory in Lund.

The Swedish National Infrastructure for Computing (SNIC) is a national research infrastructure with a threefold mission:

- provide a balanced and cost-efficient set of resources and user support for large scale computation and data storage
- meet the needs of researchers from all scientific disciplines and from all Swedish universities and university colleges
- make the resources available through open application procedures such that the best Swedish research is supported

SNIC is a distributed infrastructure funded in part by the Swedish Research Council (Vetenskapsrådet) and in part by the participating universities: Chalmers University of Technology, KTH Royal Institute of Technology, Linköping University, Lund University, Stockholm University, the University of Gothenburg, Umeå University and Uppsala University.

Scope of the Upcoming Procurement

The High Performance Computing resource AC2018 being procured is intended for Swedish academic research and will be funded by the NSC partner organisation Swedish National Infrastructure for Computing (SNIC).

The AC2018 resource will be purchased and owned by NSC. SNIC funds the resource by way of the Swedish Research Council (VR), a part of the Swedish Ministry of Education and Research.

Purpose of the New System

The target audience for the AC2018 resource consists of Swedish academic researchers. Historically, the computational load on NSC academic resources has been predominantly computational chemistry and computational materials physics, and this is not expected to change dramatically in the near future. The benchmarks selected for the evaluation of offers reflect this. Most of the resource load will consist of small to medium sized jobs, roughly up to the equivalent of 512 current x86_64 cores per job, and the system thus has a predominantly throughput focus. Specifically, the system is expected to have a workload profile like:

- Academic codes with a focus on electronic structure codes and Molecular Dynamics, ~60% of compute time:
  - VASP
  - Gaussian, Gamess-US, NWChem, Molcas, Orca, Jaguar
  - Gromacs, NAMD, Amber, LAMMPS, DL_POLY
  - Quantum Espresso, CP2K, CPMD
- Climatology: EC-EARTH, WRF, Nemo, IFS
- CFD: Nek5000, Ansys FLUENT
- Long tail of less prominent/well-known community codes of a similar nature to the above. A list is available at
- User specific codes, either compiled or interpreted, both serial and parallel.
- Development of all kinds of scientific codes: serial, parallel and accelerator offloaded.

System Description

AC2018 may consist of several parts: a bulk part for scientific data production, possibly equipped with GPUs to a significant extent, and a pre-/post-processing, visualisation and development part. The pre-/post-processing part will need to be of the x86_64 architecture to support legacy software only available for this architecture. Its expected size lies in the order of tens of nodes, which must be part of the offer within the same budget.

The bulk part of the system can be of one of the x86_64, POWER or ARM64/AArch64 architectures. AC2018 will be running a wide variety of applications in production. Three or possibly four of these will be used for benchmarking of performance and will be included in the evaluation, see section Evaluation and Benchmarks below. The benchmark applications are written in Fortran, C, C++ and CUDA and are parallel applications using MPI, or MPI and OpenMP, for their parallelism. The operating system will be Linux.

With the exception of the pre-/post-processing part, the system should be based on processors of any of the following architectures: x86_64, ARM64/AArch64, POWER. The x86_64 option includes the alternatives AMD Zen and socket-based Intel Xeon Phi, as well as standard Intel Xeon of the Broadwell generation or newer. Detailed information will be issued with the Invitation To Tender. The system interconnect should be of a high speed RDMA capable type such as InfiniBand, Omni-Path or possibly RoCE/iWARP; the choice is left to the offering vendors.

For the bulk part of the resource, vendors can choose to offer CPU-only nodes or a combination of CPU-only nodes and GPU-equipped nodes. GPU-equipped nodes can be offered as part of the system up to the amount of 30% of the budget (TCO). That is, a vendor can offer either a CPU-only cluster, or a cluster with 30% GPU-equipped nodes and the remainder CPU-only nodes. Disregarding the GPU, GPU-equipped and CPU-only nodes are not required to be identically configured, e.g. CPU model (aside from its architecture) and socket count need not be the same for the different node types. Offer evaluation will be based on fastest time to solution on a job mix of the specified benchmarks, see section Benchmarks.

Current NSC HPC systems are Linux clusters running a common NSC software stack based primarily, but not exclusively, on open software, including:

- Linux, CentOS distribution
- Lustre and/or GPFS
- In-house developed software for installing, booting, monitoring, accounting and management, built on Python, Django, Nagios, BitTorrent, Dracut, collectl, and many other packages
- SLURM job management and job scheduling
- Modules for managing software packages
- Compiler wrappers and scripts for compilation and execution of MPI jobs
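
For orientation only: as noted above, the benchmark applications are parallelised with MPI, or MPI combined with OpenMP. The minimal C sketch below illustrates that hybrid model (MPI ranks across nodes, OpenMP threads within each rank). It is not taken from, and does not represent, any of the benchmark codes, and the suggested build command is just one common possibility; actual compiler wrappers and flags depend on the offered toolchain.

    /* Illustrative only: a minimal hybrid MPI + OpenMP program of the kind
     * the benchmark applications resemble (MPI ranks across nodes, OpenMP
     * threads within each rank). Not part of the benchmark suite.
     * One typical build command (wrapper name depends on the toolchain):
     *   mpicc -fopenmp hybrid.c -o hybrid
     */
    #include <mpi.h>
    #include <omp.h>
    #include <stdio.h>

    int main(int argc, char **argv) {
        MPI_Init(&argc, &argv);

        int rank, size;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        /* Each rank computes a partial sum over its slice of the index
         * range, using OpenMP threads within the rank. */
        double local = 0.0;
        #pragma omp parallel for reduction(+:local)
        for (long i = rank; i < 10000000; i += size)
            local += 1.0 / (1.0 + (double)i);

        /* Combine the partial sums across ranks on rank 0. */
        double total = 0.0;
        MPI_Reduce(&local, &total, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

        if (rank == 0)
            printf("ranks=%d threads/rank=%d sum=%.6f\n",
                   size, omp_get_max_threads(), total);

        MPI_Finalize();
        return 0;
    }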

It is required that AC2018 be able to run the NSC software stack. Red Hat/CentOS 7 should fulfil this criterion. Mounting the GPFS file system using NFS is acceptable for architectures without native GPFS client support. AC2018 is a compute-only resource and will not include storage.

Contract Size

A preliminary estimate of the contract size is MSEK, including computer equipment, installation and four years of service and support. It will depend on the power efficiency of the tendered solution.

Evaluation and Benchmarks

The evaluation will be based on the performance over lifetime cost for the system. Benchmark performance figures will be used to evaluate offered solutions. NSC may modify the benchmarks until the time of the ITT publication. Any modifications made to the benchmarks or to the required performance figures will be advertised to registered vendors, see section Access. The benchmark codes being considered are VASP, Gromacs and CP2K. This list may be changed, but that is not very likely.

Evaluation

Tenders will be evaluated and ranked based on offered system performance in terms of the fastest total time to run a job mix of benchmarks. This job mix could for instance be 1000 VASP jobs, Gromacs jobs and 500 CP2K jobs. The final number of the respective benchmark jobs will be decided at the time of the ITT publication, but will roughly correspond to the following percentages of the total running time:

- VASP, 50%
- Gromacs, 25%
- CP2K, 25%

Achieving a high turnover rate on these benchmarks on the system as a whole is of course very important, but the individual jobs will also be required to pass a quality mark in terms of time to solution. It can be assumed that this time to solution on individual jobs can be reached on approximately a compute power equivalence amounting to x86_64 CPU cores of recent date.
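
For illustration only: the sketch below shows one way the total time to drain a job mix of the kind described above could be estimated, given assumed per-job runtimes, node counts and system size. All figures in it are invented placeholders, chosen only so that the runtime shares come out roughly as VASP 50%, Gromacs 25%, CP2K 25%; they are not NSC requirements, and the real evaluation will use the job counts and quality marks stated in the ITT.

    /* Illustrative only: estimating the time to drain a hypothetical
     * benchmark job mix on an offered system, assuming ideal packing and
     * no scheduling overhead. All numbers are invented placeholders. */
    #include <stdio.h>

    int main(void) {
        const char  *name[]  = { "VASP", "Gromacs", "CP2K" };
        const int    njobs[] = { 1000,   2000,      500    };  /* hypothetical job counts     */
        const double hours[] = { 4.0,    2.0,       4.0    };  /* assumed per-job runtime (h) */
        const int    nodes[] = { 8,      4,         8      };  /* assumed nodes per job       */
        const int    system_nodes = 400;                       /* assumed offered system size */

        /* Total work per code, in node-hours; the mix above gives roughly
         * a 50/25/25 split of the total running time. */
        double total_node_hours = 0.0;
        for (int i = 0; i < 3; ++i) {
            double nh = njobs[i] * hours[i] * nodes[i];
            total_node_hours += nh;
            printf("%-8s %5d jobs, %8.0f node-hours\n", name[i], njobs[i], nh);
        }

        /* Idealised makespan: total work divided by the number of nodes. */
        printf("Estimated time to drain the job mix: %.1f hours\n",
               total_node_hours / system_nodes);
        return 0;
    }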

No particular restriction will be put on the amount of hardware resources used to reach the performance mark. That is, if two vendors were to exactly meet the performance mark of a benchmark using, for instance, 256 and 384 CPU cores respectively, this will be of no consequence in the ranking between them, provided their offered full systems' aggregate job throughput is the same (all else being equal).

Vendor Specific Build Tools and Runtimes

All vendor specific tools and libraries used to produce evaluation results must be included as part of the offer, in reasonable amounts, to be used by the users of the resource. This could for instance be vendor MPI implementations, compilers and their associated runtimes, and math libraries.

Benchmarks

The benchmark codes are VASP, Gromacs and CP2K. A good-practice guide for compiling and running these codes is provided in README files accompanying the respective benchmark archives. To facilitate competition on as equal terms as possible, a handful of directives listed in the README files regarding building and running the benchmarks should be followed. These directives include, for instance, regulations around vendor code modifications and permissible optimisation techniques. File I/O in the benchmarks is kept low enough to isolate and measure CPU, memory and MPI/interconnect performance. The purpose of the benchmarks is to enable vendors to get acquainted with the codes and to disseminate guides for building and executing the benchmarks correctly with good performance. Other benchmarks, not used for tender evaluation purposes, may be included in the tender, mainly aimed at testing components of the system. Performance numbers for multiple simultaneous executions of the benchmarks will be required for the evaluation. The number of simultaneous copies of the benchmarks to be run will be announced in the ITT. Node sharing of individual jobs will not be permitted in the evaluation.

Scientific Codes

The following codes and versions will be used:

VASP

VASP version Feb16 will be used. See this FAQ entry for code access details.

Gromacs

Version of GROMACS will be used. Architecture specific computational kernels may be backported from the official GROMACS git repository, as may the necessary CMake infrastructure for building these kernels. See for git repository access. Any such backport must be clearly accounted for by any vendor choosing to do so.

CP2K

Version 4.1 will be used.

Versions

To keep track of changes, we will provide a version number for each release of the benchmarks. Modifications to the benchmark package will be communicated via email to all vendors who have received access to the benchmark package, as described in section Access below.

Access

The benchmark suites are available at To acquire access to these benchmarks, register by sending an email with your contact information to Katja Ekström (contact information below) stating that you want access to the AC2018 benchmark suite. This gives us the possibility to contact you if we find an error or have to update the suites for any reason. Modifications to the benchmark package will be communicated to all vendors who have received access to the benchmark package.

Schedule

Preliminary schedule for the tendering process for AC2018:

Date             Event
July 2017        Invitation to participate
August 2017      Invitation to tender
September 2017   Tender deadline
October 2017     Negotiations
November 2017    Award of Contract

General Information Regarding the ITT

The procurement will use the negotiated procedure.

Contact Information

Contact person for the upcoming procurement:

Name: Katja Ekström
Position: Procurement officer