Cray recently introduced the DataWarp™ applications I/O accelerator. The National Nuclear Security Administration will deploy this technology at large scale in the “Trinity” system next year. This is a remarkable trend in the market, and for Cray it is a reinvention of a technology that was deployed 30 years ago.
Cray introduced the technology in its supercomputers in the early 1980s. The company offered solid-state storage devices (SSDs) for temporary storage of datasets as an extension to the Cray X-MP, Cray-1S and Cray-1M systems to significantly increase data transfer rates. This “burst rate” depended on SSD memory size and configuration. The SSD was housed in a stand-alone four-column chassis, and its power supplies and cooling systems were similar to those of the Cray mainframe systems.
The SSD was used as a fast-access device for large pre-staged or intermediate files generated and manipulated repeatedly by user programs. Datasets could be assigned to the SSD with a single Cray Operating System (COS) control statement, without modifying the application. Memory sizes were 256, 512 or 1,024 MB. The fastest port, connected directly to the Cray X-MP mainframe over a single channel, allowed transfers of up to 1,000 MB/s, a rate not so different from today’s commodity SSDs. Data protection (SECDED) was integrated.
Today we are again experiencing a widening gap between computational performance and disk-based storage performance. For a spectrum of applications with demanding I/O workloads, it is beneficial to bring fast I/O devices as close as possible to the compute resources. The Cray XC40™ with the Aries HPC interconnect, which provides ultra-low latency and superior per-node injection bandwidth, allows a high-speed connection to I/O nodes with enterprise-level SSDs. This new tier of high-performance storage fills the performance gap, serving, for example, as a local file cache. SSDs provide higher bandwidth and better random-access characteristics to address small-block I/O. The DataWarp system’s I/O blades are XC series blades with integrated PCIe Gen3-based SSDs.
The technology is highly beneficial for many CAE applications, such as MSC Nastran, Abaqus and ANSYS. These represent the more I/O-intensive group of CAE workloads in the market and typically require millions of input/output operations per second, reading and writing small files. Ultrafast local SSD storage connected to the compute nodes by Cray’s Aries interconnect leads to a significant acceleration for this group of applications compared to disk-based file systems.
The DataWarp accelerator provides a new level of flexibility, enabling users to allocate the appropriate type and amount of data storage and I/O movement per job, process, rank or node. Storage is dynamically allocated to maximize compute and storage utilization across the entire system: you can put a scratch file system on every node, or provide a burst capability local to the compute nodes for faster application checkpoint/restart. Because the SSDs meet burst bandwidth requirements, disk-based storage can be provisioned for average load rather than peak. DataWarp allows full flexibility in creating pools of SSD storage, helping to handle “bursty” I/O-intensive applications and saving users storage costs.

One usage model is a per-node “tmp/scratch”: each compute node in a job is assigned a private portion of the allocated SSD space, which can also serve as dynamic swap space for that node. The SSDs can also be used as a cache, temporarily storing data before it is moved to the compute nodes or to external storage. This also helps ensure the peak performance of the Lustre parallel file system by absorbing surge peaks from single applications. A typical use case is job checkpoints written to SSDs, followed by an asynchronous, explicit or transparent copy-out to rotating storage. These checkpoints can be huge and might otherwise affect the overall I/O performance of the parallel file system.
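As a sketch of this per-job allocation model, a batch script on a DataWarp-equipped system can request burst-buffer capacity and data staging with `#DW` directives. The example below assumes a Slurm workload manager with Cray's DataWarp integration; the node count, capacity, Lustre paths and application name are illustrative, not taken from the Trinity configuration:

```shell
#!/bin/bash
#SBATCH --nodes=64 --time=01:00:00

# Request a job-lifetime DataWarp scratch allocation, striped across SSD nodes.
#DW jobdw type=scratch access_mode=striped capacity=1TiB

# Stage input from the Lustre file system into the allocation before the job starts.
#DW stage_in type=file source=/lus/scratch/user/input.dat destination=$DW_JOB_STRIPED/input.dat

# Asynchronously copy checkpoints back to Lustre after the job completes.
#DW stage_out type=directory source=$DW_JOB_STRIPED/ckpt destination=/lus/scratch/user/ckpt

mkdir -p "$DW_JOB_STRIPED/ckpt"

# The application reads input and writes checkpoints at SSD speed;
# DataWarp drains the checkpoints to disk-based storage in the background.
srun ./my_app --input "$DW_JOB_STRIPED/input.dat" --ckpt-dir "$DW_JOB_STRIPED/ckpt"
```

Because the `stage_out` copy happens after the job ends, the application never waits on rotating storage: it pays only the SSD write cost at checkpoint time, which is exactly the burst-absorption pattern described above.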
As Cray moves forward with installing 3 PB of ultra-fast SSD devices in Trinity, the company is continuously working on additional usage models for local and shared I/O. The long-term vision is a seamless, end-to-end tiered storage solution spanning DRAM, NVRAM, SSD, Lustre, disk archive and tape, with full data protection and management to ensure users have the right data on the right media.