The humorous remark by computer engineer Ken Batcher that “a supercomputer is a device for turning compute-bound problems into I/O-bound problems” rings increasingly true, since I/O subsystems are typically slow compared to other parts of a supercomputer. This is mainly due to the well-known and still-widening performance gap between computing components, which are designed primarily for speed, and storage devices, which are designed primarily for capacity rather than performance.
In June 2015, King Abdullah University of Science and Technology (KAUST) put into production “Shaheen II,” a 36-cabinet Cray® XC40™ supercomputer delivering a theoretical peak performance of 7.2 PF, along with a 17 PB Cray® Sonexion® 2000 parallel file system. The parallel file system is designed to provide over 500 GB/s of I/O throughput. The current workload on Shaheen is diverse and includes data-intensive projects such as seismic imaging, computational fluid dynamics, weather and climate modeling and biology applications.
Using Darshan, an I/O characterization and profiling tool in production on Shaheen, the KAUST Supercomputing Lab (KSL) team is able to capture applications’ I/O behavior on the system and tune I/O-heavy applications to take advantage of the parallel file system. However, tuning and optimization are sometimes insufficient, especially for applications that generate data faster than the file system can absorb it, and sometimes so tedious that users are reluctant to rewrite their codes.
As part of the Shaheen procurement, in November 2015 Cray completed the installation of 268 Cray® DataWarp™ accelerator nodes hosting a total of 536 Intel SSD cards. The combination provides an aggregate burst buffer capacity of 1.56 PB to Shaheen users. This fast intermediate storage layer provides up to three times the performance of the Lustre® parallel file system.
With the support of the KSL team, Cray’s Joe Glenski and the Cray performance team, the IOR benchmark was run on Shaheen using all 268 DataWarp accelerator nodes and 5,628 compute nodes, achieving 1.54 TB/s for IOR write and 1.66 TB/s for IOR read. To the best of our knowledge, these are the highest IOR performance numbers ever obtained on any single parallel file system in the world.
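To put these figures in perspective, the short Python sketch below derives a few per-device and per-node averages from the numbers quoted above. It assumes decimal units (1 PB = 1,000 TB, 1 TB/s = 1,000 GB/s) and treats the file system's "over 500 GB/s" design throughput as a lower bound, so the resulting ratios are rough estimates rather than measured values.

```python
# Figures quoted in the article.
DATAWARP_NODES = 268
SSD_CARDS = 536
BURST_BUFFER_CAPACITY_PB = 1.56
IOR_WRITE_TBPS = 1.54
IOR_READ_TBPS = 1.66
LUSTRE_DESIGN_GBPS = 500  # "over 500 GB/s", used here as a lower bound

# Assumption: decimal units (1 PB = 1000 TB, 1 TB/s = 1000 GB/s).
ssds_per_node = SSD_CARDS / DATAWARP_NODES
capacity_per_ssd_tb = BURST_BUFFER_CAPACITY_PB * 1000 / SSD_CARDS

# Average bandwidth each DataWarp node would have to sustain.
write_per_node_gbps = IOR_WRITE_TBPS * 1000 / DATAWARP_NODES
read_per_node_gbps = IOR_READ_TBPS * 1000 / DATAWARP_NODES

# Ratio of measured burst buffer bandwidth to the file system design target.
write_ratio = IOR_WRITE_TBPS * 1000 / LUSTRE_DESIGN_GBPS
read_ratio = IOR_READ_TBPS * 1000 / LUSTRE_DESIGN_GBPS

print(f"SSD cards per DataWarp node: {ssds_per_node:.0f}")
print(f"Average capacity per SSD:    {capacity_per_ssd_tb:.2f} TB")
print(f"Average write per node:      {write_per_node_gbps:.2f} GB/s")
print(f"Average read per node:       {read_per_node_gbps:.2f} GB/s")
print(f"Write vs. 500 GB/s target:   {write_ratio:.2f}x")
print(f"Read vs. 500 GB/s target:    {read_ratio:.2f}x")
```

The ratios come out slightly above three, which is consistent with the "up to three times" figure quoted for the burst buffer relative to the parallel file system.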
Shaheen users can benefit from this technology without changing their applications; they only need to update their SLURM job submission scripts. Early experiments on seismic applications and climate code achieved an improvement of around 30 percent.
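As an illustration of what such a script update might look like, the Python sketch below generates a SLURM batch script that requests a burst buffer allocation through `#DW` directives. The directive layout (`jobdw`, `stage_out`) and the `$DW_JOB_STRIPED` variable follow the general form of Cray's DataWarp job script interface as we understand it; the function name, capacity, node count, time limit, output directory and application command are all hypothetical placeholders, and the exact options available on Shaheen should be checked against the site documentation.

```python
def burst_buffer_job_script(app_command, capacity="1TiB", nodes=4,
                            output_dir="/scratch/myuser/output"):
    """Return a SLURM batch script that runs app_command from a burst buffer.

    All default values are hypothetical; only the job script changes,
    the application itself is left untouched.
    """
    lines = [
        "#!/bin/bash",
        f"#SBATCH --nodes={nodes}",
        "#SBATCH --time=01:00:00",
        # Request a per-job, striped scratch allocation on the burst buffer.
        f"#DW jobdw type=scratch access_mode=striped capacity={capacity}",
        # Copy results back to the parallel file system when the job ends.
        "#DW stage_out type=directory "
        f"source=$DW_JOB_STRIPED/output destination={output_dir}",
        "",
        # Run inside the burst buffer mount point so the application's
        # relative-path I/O lands on the SSD layer.
        "mkdir -p $DW_JOB_STRIPED/output",
        "cd $DW_JOB_STRIPED/output",
        f"srun {app_command}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # "./my_seismic_app" is a placeholder for the user's unchanged binary.
    print(burst_buffer_job_script("./my_seismic_app", capacity="2TiB", nodes=8))
```

The generated text can be saved to a file and submitted with `sbatch` in the usual way. This sketch assumes the application reads and writes through relative paths; a code with hard-wired absolute paths would need its paths passed in as arguments instead.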
Seismic imaging applications developed at KAUST, such as reverse time migration and full waveform inversion, will benefit tremendously from the new installation of this fast parallel I/O layer. Early results showed a performance improvement of up to 34 percent in one of the most I/O-intensive algorithms in seismic imaging.
“The Shaheen supercomputer brought a new life to my group,” said Gerard Schuster, a KAUST professor of Earth Science and Engineering. “For example, we were able to perform elastic least squares natural migration of seismic data recorded over several months in Long Beach, California, by a 3D recording array. The migration results confirmed the presence of known faults in this area, and revealed the existence of unknown faults that did not break the surface. I believe this will not only lead to revisions in the earthquake hazard assessment of the Long Beach area, but this technique will also be adopted by the general earthquake community. Imaging of the entire Long Beach data would not have been practical without the computational power of Shaheen and the I/O performance improvements.”
Research relating to climate prediction and environmental modeling has been greatly enhanced by the performance of Shaheen. The work of KAUST Professor of Earth Science and Engineering, Georgiy Stenchikov, focuses on calculating the radiative forcing of dust aerosols in the Middle East and their impact on the regional circulation and temperature distribution. His work utilizes the unique global high-resolution atmospheric model HiRAM, initially developed at the NOAA Geophysical Fluid Dynamics Laboratory, that runs with 100 times higher spatial resolution than the conventional global models. The aim of his research is to develop a solid scientific basis to support the environmental policy and mitigation measures of Saudi Arabia.
Valerio Pascucci, a visiting professor at KAUST and a professor at the University of Utah, said, “Our experience with scaling PIDX on Shaheen has already shown that the Cray architecture allows us to achieve very high performance I/O in production settings. With the introduction of the burst buffer, we expect to further multiply the performance that can be provided to scientists in real applications. One important factor will be the increased asynchronicity of the physical I/O with the use of the burst buffer that will allow better overlapping of data storage and computing. More importantly, we plan to take advantage of the burst buffer for fast checkpoint dumps and restarts without disk I/O, which will help address the resiliency challenges that will be increasingly present with the advent of exascale computing. An additional component we plan to focus on is the use of the burst buffer as a staging area for in-situ data analytics with dramatic improvement of the science output and data management reduction.”
With the state-of-the-art equipment on Shaheen, the programming environment provided by Cray and the support of KSL computational scientists, KAUST researchers have all the ingredients to make scientific discoveries faster and more efficiently.
To learn more about Cray’s DataWarp technology, please visit: http://www.cray.com/products/computing/xc-series?tab=datawarp