Cray® XK7™: Scalable Many-Core Performance
Adaptive Hybrid Computing
The Cray® XK7™ compute node combines AMD's 16-core Opteron™ 6200 Series processor with the NVIDIA® Tesla® K20 GPU Accelerator to create a hybrid unit that offers intra-node scalability, the power efficiency of acceleration, and the flexibility to run applications with either scalar or accelerator components. This compute unit, combined with the excellent inter-node scalability of Cray’s Gemini interconnect, creates a system geared for any computing challenge.
Integrated Hardware Supervisory System
Cray's Hardware Supervisory System (HSS) integrates hardware and software components to provide system monitoring, fault identification and recovery. An independent system with its own control processors and supervisory network, the HSS monitors and manages all major hardware and software components in the Cray XK7 supercomputer. In addition to providing recovery services in the event of a hardware or software failure, HSS controls power-up, power-down and boot sequences, manages the interconnect, reroutes around failed interconnect links, and displays the machine state to the system administrator.
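As a rough illustration of what "rerouting around failed interconnect links" means, the sketch below finds a shortest path through a small network while skipping links marked as failed. It is a generic breadth-first search and does not describe the routing algorithm HSS or Gemini actually uses; the function name `shortest_route`, the link list, and the four-node ring are all hypothetical.

```python
from collections import deque

def shortest_route(links, failed, src, dst):
    """Breadth-first search over bidirectional links, skipping failed ones.

    Hypothetical helper for illustration only; returns a list of nodes
    from src to dst, or None if no route survives the failures.
    """
    down = {frozenset(link) for link in failed}
    adjacency = {}
    for a, b in links:
        if frozenset((a, b)) in down:
            continue  # a failed link carries no traffic in either direction
        adjacency.setdefault(a, []).append(b)
        adjacency.setdefault(b, []).append(a)

    previous = {src: None}  # also serves as the visited set
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbor in adjacency.get(node, []):
            if neighbor not in previous:
                previous[neighbor] = node
                queue.append(neighbor)
    return None

# Assumed topology: a four-node ring 0-1-2-3-0.
ring = [(0, 1), (1, 2), (2, 3), (3, 0)]
print(shortest_route(ring, [], 0, 1))        # direct link is used
print(shortest_route(ring, [(0, 1)], 0, 1))  # traffic goes the long way round
```

With the direct link between nodes 0 and 1 marked as failed, the route goes around the other side of the ring, so the two nodes can still communicate.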
Cray XK7 System Resiliency
The Gemini interconnect is designed for large systems in which failures are to be expected and applications must run to successful completion in the presence of errors. Gemini uses error-correcting code (ECC) to protect major memories and data paths within the device. In addition, the Cray Linux Environment (CLE) features NodeKARE™ (Node Knowledge and Reconfiguration). If a program terminates abnormally, NodeKARE automatically runs diagnostics on all involved compute nodes and removes any unhealthy ones from the compute pool. Subsequent jobs are then allocated only to healthy nodes, so they can run reliably to completion. The XK7 supercomputer’s Lustre® file system can be configured with object storage target failover and metadata server failover.
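The NodeKARE behavior described above (diagnose the nodes a failed job used, remove unhealthy ones, allocate later jobs only from healthy nodes) can be sketched as a small model. This is a minimal illustration, not Cray's implementation: the `ComputePool` class, its method names, and the `diagnose` callback are all hypothetical, and real diagnostics are far more involved than a single pass/fail check.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, List, Set

@dataclass
class ComputePool:
    """Hypothetical model of the nodes available to the job scheduler."""
    healthy: Set[int] = field(default_factory=set)
    removed: Set[int] = field(default_factory=set)

    def handle_abnormal_exit(self, job_nodes: Iterable[int],
                             diagnose: Callable[[int], bool]) -> None:
        """Diagnose every node the failed job used; pull failures from the pool."""
        for node in job_nodes:
            if node in self.healthy and not diagnose(node):
                self.healthy.remove(node)
                self.removed.add(node)

    def allocate(self, count: int) -> List[int]:
        """Return `count` healthy nodes, lowest id first."""
        if count > len(self.healthy):
            raise RuntimeError("not enough healthy nodes for this job")
        return sorted(self.healthy)[:count]

# Assumed scenario: a job on nodes 1 and 2 terminates abnormally,
# and the (stand-in) diagnostic reports node 2 as unhealthy.
pool = ComputePool(healthy={0, 1, 2, 3})
pool.handle_abnormal_exit([1, 2], diagnose=lambda node: node != 2)
print(pool.allocate(3))  # [0, 1, 3]
```

The point of the design is that the failed node is taken out of circulation once, at the time of the failure, so later jobs are never scheduled onto it.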
Extreme Scale and Cluster Compatibility
The Cray XK7 system provides complete workload flexibility: a single machine can run both a highly scalable custom workload and an industry-standard ISV workload. CLE accomplishes this through Cluster Compatibility Mode (CCM). CCM provides immediate compatibility with Linux/x86 versions of ISV software, without recompilation or relinking, and supports various MPI implementations (such as MPICH and Platform MPI™). The service is dynamic and available on a per-job basis.
Support for Other File System and Data Management Services
You can select the Lustre parallel file system or another option, such as connecting to an existing parallel file system. The Cray Data Virtualization Service can project various other file systems (including NFS, GPFS™, Panasas® and StorNext®) to the compute and login nodes on the Cray XK7 system. Cray can also provide solutions for backup, archiving and data lifecycle management.
Many-core processing is the key to ultimate energy efficiency. Applications that use the Cray XK7 GPU processors will experience industry-leading energy efficiency, as measured on real application workloads. Combined with our standard air- or liquid-cooled high-efficiency cabinet and optional ECOphlex™ technology, the Cray XK7 system can reduce cooling costs and increase flexibility in datacenter design.