HTCaaS: Leveraging Distributed Supercomputing Infrastructures for Large-Scale Scientific Computing


HTCaaS: Leveraging Distributed Supercomputing Infrastructures for Large-Scale Scientific Computing
Jik-Soo Kim, Ph.D., National Institute of Supercomputing and Networking (NISN) at KISTI

Table of Contents Introduction HTCaaS: High-Throughput Computing as a Service Evaluation Conclusions & Discussions Slide #2

Introduction
From HTC to Many-Task Computing (MTC) [MTAGS 08]:
- A very large number of tasks (millions or even billions)
- Relatively short per-task execution times (seconds to minutes)
- Data-intensive tasks (i.e., tens of MB of I/O per second)
- A large variance of task execution times (ranging from hundreds of milliseconds to hours)
- Communication-intensive, but coordinated through files rather than a message passing interface (such as MPI)
- Application domains: astronomy, physics, pharmaceuticals, chemistry, etc.
Slide #3

Introduction
Middleware Systems for HTC/MTC applications:
- Ease of Use: minimize user overhead for handling jobs and resources
- Efficient Task Dispatching: the overhead of task dispatching should be low enough
- Adaptiveness: adjust acquired resources according to changing load
- Fairness: ensure fairness among multiple users submitting various numbers of tasks
- Reliability: failed or suspended tasks should be automatically resubmitted and managed
- Resource Integration: effectively integrate as many computing resources as possible
Slide #4

Introduction
Our Approach: High-Throughput Computing as a Service (HTCaaS)
- Meta-Job based automatic job split and submission, e.g., parameter sweeps or N-body calculations (see the sketch after this slide)
- Agent-based multi-level scheduling
- User-level scheduling and dynamic fairness
- Pluggable interface to heterogeneous computing resources
- Support for many client interfaces: native WS-interface, Java API, and easy-to-use client tools (CLI/GUI/Web portal)
HTCaaS is currently running as a pilot service on top of PLSI, supporting a number of scientific applications from the pharmaceutical domain and high-energy physics.
Slide #5
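The slides do not show how a Meta-Job is expanded internally; the following is a minimal, hypothetical sketch of splitting a parameter-sweep Meta-Job into independent tasks before submission. The function and field names are illustrative only and are not the HTCaaS API.

```python
# Hypothetical illustration of Meta-Job expansion for a parameter sweep.
# None of these names come from HTCaaS; they only sketch the idea of turning
# one user-level "meta" description into many independent tasks.
from itertools import product

def split_meta_job(executable, parameter_space):
    """Expand a parameter-sweep meta-job into one task per parameter combination."""
    tasks = []
    keys = sorted(parameter_space)
    for values in product(*(parameter_space[k] for k in keys)):
        arguments = dict(zip(keys, values))
        tasks.append({"executable": executable, "arguments": arguments})
    return tasks

# Example: a sweep over two parameters yields 6 independent tasks, each of
# which would then be placed on the submitting user's job queue.
tasks = split_meta_job("simulate.sh", {"temperature": [300, 310, 320], "seed": [1, 2]})
print(len(tasks))  # 6
```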

Table of Contents Introduction HTCaaS: High-Throughput Computing as a Service Evaluation Conclusions & Discussions Slide #6

High-Throughput Computing as a Service System Architecture Slide #7

High-Throughput Computing as a Service
[System architecture diagram showing the HTCaaS components: Account Manager (certificate handling), Monitoring Manager, Job Manager with per-user Job Queues, User Data Manager, Agent Manager, and Agents deployed on remote resources; job submission is described via the JSDL Parameter Sweep Job Extension.]
Slide #8

High-Throughput Computing as a Service
Job Queues and Agents Management
- HTCaaS maintains separate job queues and agents per user, reducing the complexity of accounting and scheduling, and allowing the usage of PLSI computing resources to be tracked and metered per user.
- Carefully calibrating the number of agents per user can address the problem of fair resource sharing among multiple users.
- Each agent actively pulls tasks from its dedicated job queue, which corresponds to a specific user; if there are no more tasks to be processed, it automatically releases the acquired resources and exits (see the sketch below).
Monitoring and Fault-tolerance
- The Monitoring Manager periodically checks the status of agents and tasks.
- If some agents or tasks fail, the Monitoring Manager informs the Agent Manager (or the Job Manager) to resubmit the failed agents (or tasks) and manage them, addressing Reliability.
Slide #9
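The pull-based agent behavior described above can be summarized in a short sketch. The queue interface (fetch_task/report_result) and task format are assumptions made for illustration, not the actual HTCaaS agent code.

```python
# Minimal sketch of a pull-based agent loop, assuming a per-user task queue
# object with fetch_task()/report_result() methods; these names are
# illustrative only and do not reflect the real HTCaaS implementation.
import subprocess

def run_agent(job_queue):
    """Pull tasks from the user's dedicated queue until it is empty, then exit."""
    while True:
        task = job_queue.fetch_task()          # blocking pull from the user's queue
        if task is None:                       # queue drained: release the slot
            break                              # agent exits, freeing the acquired core
        result = subprocess.run(
            [task["executable"], *map(str, task["arguments"].values())],
            capture_output=True,
        )
        job_queue.report_result(task, result.returncode)
```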

High-Throughput Computing as a Service
User-level Scheduling and Dynamic Fairness
- The dynamic fair resource sharing algorithm divides all available computing resources fairly across all demanding users in the system (when the system is heavily loaded) and dynamically adjusts the acquired resources as free computing resources become available (as the overall system becomes lightly loaded).
- Resource Allotment Function: Weight(p) represents the weight of a user p and can take many different factors into account, such as the number of tasks submitted by user p, task running time, priority, etc. This weighted fairness addresses Fairness and Adaptiveness; a sketch of the weighted allotment appears below.
Slide #10
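The slide names a Resource Allotment Function without giving its exact form. A natural reading, assumed here, is a weighted fair share of the available cores among the currently demanding users; the formula and names below are inferred for illustration, not taken verbatim from HTCaaS.

```python
# Weighted fair-share allotment: each demanding user p receives a share of the
# available cores proportional to Weight(p). This formulation is an assumption
# consistent with the slide's description, not the verbatim HTCaaS function.
def resource_allotment(available_cores, weights):
    """weights: {user: Weight(p)} for all currently demanding users."""
    total_weight = sum(weights.values())
    return {user: int(available_cores * w / total_weight) for user, w in weights.items()}

# Example matching the timeline on the next slide: with 100 cores and equal
# weights, users A and B each receive RA = 50.
print(resource_allotment(100, {"A": 1, "B": 1}))  # {'A': 50, 'B': 50}
```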

High-Throughput Computing as a Service
Timeline & Events (example with 100 cores; columns in the original table: Timeline & Events, Computing Resources, RA, Job Queues):
- t0: All 100 cores are free (100 free cores).
- t1: User A arrives and submits 200 tasks; RA(A)=100.
- t2: User B arrives and submits 50 tasks while User A's 100 tasks are running; RA(A)=50, RA(B)=50; job queue A: 100 tasks (DU).
- t3: User A releases 50 cores.
- t4: User B acquires 50 cores; User A's 50 tasks and User B's 50 tasks are running; job queues A: 50 tasks (DU), B: 0 tasks (SU).
- t5: User C arrives and submits 20 tasks; RA(A)=50, RA(C)=20; job queue A: 50 tasks (DU).
- t6: User B's tasks finish; job queue C: 20 tasks (DU).
- t7: User C acquires 20 cores; User A's 50 tasks and C's 20 tasks are running, with 30 cores free.
- t8: User A acquires an additional 30 cores; User A's 80 tasks and C's 20 tasks are running; job queues A: 20 tasks (DU), C: 0 tasks (SU).
Slide #11

Table of Contents Introduction HTCaaS: High-Throughput Computing as a Service Evaluation Conclusions & Discussions Slide #12

Evaluation
PLSI (Partnership & Leadership for the nationwide Supercomputing Infrastructure)
- Provides researchers with an integrated view of geographically distributed supercomputing infrastructures to solve complex and demanding scientific problems.
- Consists of multiple organizations connected via a dedicated 1 Gbps network (about 100 TFLOPS of computing power, 1,115 nodes with 8,508 CPU cores).
Slide #13

Evaluation
PLSI provides a common software stack: accounting (based on LDAP), monitoring (via Nagios), global scheduling (based on LoadLeveler), and a global shared storage system (based on GPFS).
- HTCaaS utilizes LoadLeveler as the job submission system to available computing nodes in the PLSI, which is configured as a multi-cluster environment (see the sketch below).
- It exploits a total of 400 TB of global home/scratch directories, mounted at every computing node, as a shared storage system for input/output data and executables.
Slide #14
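Agents reach the PLSI compute nodes through the local batch system. As an illustration only (not the actual HTCaaS code), the sketch below writes a minimal LoadLeveler job command file for a single agent and submits it with llsubmit; the class name, paths, and agent binary are hypothetical placeholders.

```python
# Illustrative submission of one pilot agent through LoadLeveler. The job
# command file directives (# @ ...) follow standard LoadLeveler syntax, but
# the class name, paths, and agent binary are hypothetical placeholders.
import subprocess
from pathlib import Path

def submit_agent(agent_path="/gpfs/htcaas/bin/agent", job_class="normal"):
    cmd_file = Path("agent.cmd")
    cmd_file.write_text(
        "# @ job_type = serial\n"
        f"# @ class = {job_class}\n"
        f"# @ executable = {agent_path}\n"
        "# @ output = agent.out\n"
        "# @ error = agent.err\n"
        "# @ queue\n"
    )
    subprocess.run(["llsubmit", str(cmd_file)], check=True)
```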

Evaluation
Micro-Benchmark Experiments
- Simulate a large number of short-running tasks (sleep 10) and relatively long-running tasks (sleep 100) on the glory cluster in KISTI (300 cores).
- Performance metrics: makespan (time to complete a bag of tasks) and efficiency (comparison with ideal parallelism); see the sketch below.
- Comparison models: HTCaaS with DB Manager connections (HTCaaS(DB)), HTCaaS without DB Manager interactions (HTCaaS(No DB)), and LoadLeveler+GPFS (LoadLeveler).
Slide #15
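The slide does not spell out how efficiency is computed. A common definition for bag-of-tasks benchmarks, assumed here, compares the measured makespan against the ideal makespan under perfect parallelism:

```python
# Assumed metric definitions for a bag-of-tasks benchmark (not taken verbatim
# from the HTCaaS evaluation): the ideal makespan packs N tasks of duration t
# onto C cores perfectly, and efficiency is ideal over measured makespan.
import math

def ideal_makespan(num_tasks, task_seconds, cores):
    return math.ceil(num_tasks / cores) * task_seconds

def efficiency(num_tasks, task_seconds, cores, measured_makespan):
    return ideal_makespan(num_tasks, task_seconds, cores) / measured_makespan

# Example: 3,000 "sleep 10" tasks on 300 cores have an ideal makespan of 100 s;
# a measured makespan of 125 s corresponds to 80% efficiency.
print(efficiency(3000, 10, 300, 125))  # 0.8
```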

Evaluation
Micro-Benchmark Experiments
- For short-running tasks, HTCaaS clearly outperforms LoadLeveler.
- For relatively long-running tasks, the overheads of task dispatching can be effectively amortized; HTCaaS is still being improved.
[Charts for "sleep 10" and "sleep 100" with annotated improvements of 57%, 35%, and 5%.]
Slide #16

Evaluation
Micro-Benchmark Experiments
- LoadLeveler suffers from a job holding problem during dispatching, caused by multiple simultaneous I/O operations on the GPFS.
- HTCaaS can effectively leverage the local storage of each computing resource: the User Data Manager manages the overall input/output data staging (see the sketch below).
- This supports data-intensive HTC/MTC applications where the typical size of a single input file is relatively small (from hundreds of KBs to MBs).
[Charts for "sleep 10" and "sleep 100".]
Slide #17
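The staging path is not shown in code anywhere in the talk; the following is an illustrative sketch under the assumption that an agent copies its small input files from the shared GPFS storage to node-local scratch before running a task. Paths and helper names are hypothetical.

```python
# Hypothetical input staging by an agent: copy small inputs from the shared
# GPFS file system to node-local scratch, run the task against the local copy,
# then copy the output back. Paths and helper names are illustrative only.
import shutil
import subprocess
import tempfile
from pathlib import Path

def run_with_local_staging(executable, shared_input, shared_output_dir):
    with tempfile.TemporaryDirectory(dir="/tmp") as scratch:
        local_input = Path(scratch) / Path(shared_input).name
        shutil.copy(shared_input, local_input)                  # stage in from GPFS
        local_output = Path(scratch) / "result.out"
        subprocess.run([executable, str(local_input), str(local_output)], check=True)
        shutil.copy(local_output, shared_output_dir)            # stage out to GPFS
```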

Evaluation
Protein Docking Experiments
- AutoDock, a suite of automated docking tools, is used to dock ligands to a set of target proteins in order to discover new drugs for serious diseases such as SARS or malaria.
- Experimental setup: clusters glory, kobic, gene, and helix; four different users arrive sequentially at our system and submit various numbers of tasks (ligands), with an average inter-arrival time of 10 minutes, ranging from 1,000 down to 100 tasks for the single cluster and from 2,000 down to 250 tasks for the multi-cluster case.
- Comparison models: HTCaaS with the dynamic fair resource sharing algorithm (dynamic fairness, DF) and a simple resource partitioning algorithm with strict fairness (Simple).
Slide #18

Evaluation
Protein Docking Experiments
- HTCaaS can reduce the performance gap among users submitting various numbers of tasks, improving fairness.
- HTCaaS can seamlessly integrate multiple geographically distributed clusters and fully utilize available computing resources as they become available, while Simple inevitably wastes idle computing resources.
[Charts annotated with resource waste and a 64% figure.]
Slide #19

Table of Contents Introduction HTCaaS: High-Throughput Computing as a Service Evaluation Conclusions & Discussions Slide #20

Conclusion
- HTCaaS provides researchers with an easy way to explore large-scale and complex HTC/MTC problems by integrating distributed national supercomputers.
- It employs the concept of a Meta-Job (with easy-to-use client tools) and a fault-tolerant, agent-based multi-level scheduling mechanism (Ease of Use, Efficient Task Dispatching, and Reliability).
- It employs a dynamic fair resource sharing mechanism (Fairness and Adaptiveness).
- HTCaaS effectively leverages the local disks of geographically distributed computing resources to support data-intensive HTC/MTC applications.
- HTCaaS can seamlessly utilize all available computing resources without resource wastage (Resource Integration).
Slide #21

Discussion
Future Work
- Supporting more complex workloads consisting of HTC and HPC tasks.
- Improving the scalability of HTCaaS and applying job profiling techniques to realize the weighted form of fairness.
- How can we more efficiently support data-intensive MTC applications on top of geographically distributed computing environments such as PLSI? A shared storage system such as GPFS will become a performance bottleneck; data caching may work, but it depends on the workload characteristics.
Slide #22

Thank you! National Institute of Supercomputing and Networking 2013