Dealing with Data: Choosing a Good Storage Technology for Your Application. Rick Wagner HPC Systems Manager July 1st, 2014"

Size: px
Start display at page:

Download "Dealing with Data: Choosing a Good Storage Technology for Your Application. Rick Wagner HPC Systems Manager July 1st, 2014""

Transcription

1 Dealing with Data: Choosing a Good Storage Technology for Your Application Rick Wagner HPC Systems Manager July st, 24" 23 Summer Institute: Discover Big Data, August 5-9, San Diego, California

2 Application Focus" Storage choices should be driven by application need, not just what s available." But, applications need to adapt as they scale." Writing a few small files to an" NFS server is fine writing s simultaneously will wipe out the server." If you use binary files," don t invent your own format. Consider HDF5." 23 Summer Institute: Discover Big Data, August 5-9, San Diego, California

3 Storage Technologies" File Systems" Devices" Services" ext4! NFS! Lustre! PVFS! FUSE! memory! block! Cloud! MySQL! CouchDB! 23 Summer Institute: Discover Big Data, August 5-9, San Diego, California

4 Storage Technologies" File Systems" Devices" Services" ext4! NFS! Lustre! memory! block! Cloud! MySQL! CouchDB! PVFS! FUSE! Each has its own performance characteristics"! Not all are available everywhere" 23 Summer Institute: Discover Big Data, August 5-9, San Diego, California

5 File Systems" Classic access, POSIX, Windows" " Most relevant:" Local! Remote! NFS, CIFS! Parallel (Lustre, GPFS)! Local file systems are good for small and temporary files" " Network file systems very convenient for sharing data between systems" 23 Summer Institute: Discover Big Data, August 5-9, San Diego, California

6 Parallel File Systems" IS 53 MGT CONSOLE 36 STATUS PSU PSU 2 FAN RST Flash I/O Node 6 Compute Nodes Rail Dual GbE IS 53 MGT 35 CONSOLE 36 STATUS PSU PSU 2 FAN RST Lustre Filesystem Each switch connected to its 6 neighbors via 3 QDR links MGT IS 53 CONSOLE 36 STATUS PSU PSU 2 FAN RST Dual GbE Flash I/O Node 6 Compute Nodes MGT Rail IS 53 CONSOLE 36 STATUS PSU PSU 2 FAN RST 23 Summer Institute: Discover Big Data, August 5-9, San Diego, California

7 Parallel File Systems" TRESTLES IB cluster GORDON IB cluster TRITON Myrinet cluster Mellanox 52 Bridge 2 GB/s 64 Lustre LNET Routers GB/s Myrinet G Switch 25 GB/s 3 Dis9nct Network Architectures MDS MDS Arista 758 G Arista 758 G Redundant Switches for Reliability and Performance MDS Metadata Servers OSS 72TB OSS 72TB OSS 72TB OSS 72TB 32 OSS (Object Storage Servers) Provide GB/s Performance and >4PB Raw Capacity 23 Summer Institute: Discover Big Data, August 5-9, San Diego, California

8 A Cautionary Tale" 23 Summer Institute: Discover Big Data, August 5-9, San Diego, California

9 Devices" Raw block device (/dev/sdb) or RAM FS (/dev/shm)" Useful in specific cases, like fast scratch" Can be very good for small I/O" 23 Summer Institute: Discover Big Data, August 5-9, San Diego, California

10 Services" Things accessed programmatically!! Frequents the last thought for HPC applications: A MISTAKE!! Databases! Cloud storage (Amazon S3)! Document storage (MongoDB, CouchDB)! 23 Summer Institute: Discover Big Data, August 5-9, San Diego, California

11 Know What You Need" 23 Summer Institute: Discover Big Data, August 5-9, San Diego, California

12 Order of Magnitude Guide" Storage" file/directory" file sizes" BW" IOPs" Local HDD! s! GB! MB/s!! Local SSD! s! GB! GB/s!! RAM FS! s! GB! GB/s!! NFS! s! GB! MB/s!! Lustre/GPFS! s! TB! GB/s!! Cloud! Infinite! TB! GB/s!! DB! N/A! N/A! N/A!! 23 Summer Institute: Discover Big Data, August 5-9, San Diego, California

13 My application needs to:" Choosing" Write a checkpoint dump from memory from a large parallel simulation.! I should consider:" A parallel file system and a binary file format like HDF5.! 23 Summer Institute: Discover Big Data, August 5-9, San Diego, California

14 My application needs to:" Choosing" Run analysis on remote systems and return the results to a web portal for users.! I should consider:" Cloud storage for results and input, and local scratch space for the job.! 23 Summer Institute: Discover Big Data, August 5-9, San Diego, California

15 My application needs to:" Choosing" Randomly access many small files, or read and write small blocks from large files.! I should consider:" A database, RAM FS, or local scratch space.! 23 Summer Institute: Discover Big Data, August 5-9, San Diego, California

16 Many Boxes Make a Sad Panda" Database logos courtesy of RRZEicons! 23 Summer Institute: Discover Big Data, August 5-9, San Diego, California