Cray® XC™ Series: Adaptive Supercomputing
Extreme Scalability and Sustained Performance
Cray has an established reputation for regularly running the biggest jobs on the largest numbers of nodes in the HPC industry. The Cray® XC™ series puts even more focus on solving extreme-capability computational challenges. Cray XC series systems scale hardware, networking and software across a broad performance spectrum to deliver sustained, real-world production performance.
Aries™ Interconnect and Dragonfly Topology
To provide this breakthrough performance and scalability, Cray XC series supercomputers integrate the HPC-optimized Aries interconnect. This interconnect technology, implemented with a high-bandwidth, low-diameter network topology called Dragonfly, provides substantial improvements in the key network performance metrics for HPC: bandwidth, latency, message rate and more. Delivering global bandwidth scalability at reasonable cost across a distributed memory system, this network gives programmers global access to all of the memory of parallel applications and supports the most demanding global communication patterns.
Cray XC series systems use the Dragonfly network topology, built from a configurable mix of backplane, copper and optical links. This design provides scalable global bandwidth, avoids expensive external switches and allows easy in-place upgrades as bandwidth requirements grow. Cray XC air-cooled systems use backplane and copper cabling only, which reduces costs for technical enterprise applications.
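To make the low-diameter property concrete, the following minimal sketch counts minimal-route hops in a textbook dragonfly, where the routers in a group are fully connected and every pair of groups shares exactly one global link. It is an illustrative model only: the group size, the number of global links per router and the `gateway` wiring rule are hypothetical choices, not the actual Aries configuration.

```python
from itertools import product

def gateway(src_group, dst_group, num_groups, links_per_router):
    """Router in src_group that owns the global link to dst_group.

    Hypothetical wiring rule: global links are handed out to routers
    in order of the (cyclic) distance to the destination group.
    """
    offset = (dst_group - src_group - 1) % num_groups  # 0 .. num_groups - 2
    return offset // links_per_router

def min_hops(src, dst, routers_per_group, links_per_router):
    """Minimal router-to-router hop count; src and dst are (group, router)."""
    num_groups = routers_per_group * links_per_router + 1
    (src_group, src_router), (dst_group, dst_router) = src, dst
    if src_group == dst_group:
        # Routers inside a group are fully connected in this model.
        return 0 if src_router == dst_router else 1
    out_router = gateway(src_group, dst_group, num_groups, links_per_router)
    in_router = gateway(dst_group, src_group, num_groups, links_per_router)
    # Optional local hop, one global hop, optional local hop.
    return (src_router != out_router) + 1 + (in_router != dst_router)

def diameter(routers_per_group, links_per_router):
    """Largest minimal hop count over all router pairs."""
    num_groups = routers_per_group * links_per_router + 1
    routers = list(product(range(num_groups), range(routers_per_group)))
    return max(
        min_hops(s, d, routers_per_group, links_per_router)
        for s in routers
        for d in routers
    )

if __name__ == "__main__":
    # 4 routers per group, 2 global links each: 9 groups, 36 routers.
    print(diameter(4, 2))
```

In this model the worst case is one local hop, one global hop and one more local hop, regardless of how many groups the system contains.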
The Aries ASIC provides the network interconnect for compute nodes on Cray XC series base blades and implements a standard PCI Express Gen3 host interface, supporting a wide range of HPC compute engines. The open architecture of the Cray XC series allows the system to be configured with the best devices available today, then augmented or upgraded in the future with the user’s choice of processors, coprocessors and accelerators using processor daughter cards.
Intel® Xeon® Multi-Core Processors
Cray XC series systems use industry-leading Intel Xeon processors, scaling in excess of one million cores. This architecture implements two processor sockets per compute node and four compute nodes per blade. Compute blades stack in eight pairs (16 to a chassis), and each cabinet can be populated with up to three chassis, for a total of up to 384 sockets per cabinet.
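The packaging arithmetic above can be checked with a short sketch. The constant and function names are illustrative; the values are the ones quoted in the preceding paragraph.

```python
# Packaging figures quoted in the text above.
SOCKETS_PER_NODE = 2
NODES_PER_BLADE = 4
BLADES_PER_CHASSIS = 16      # eight pairs of blades
MAX_CHASSIS_PER_CABINET = 3

def sockets_per_cabinet(chassis=MAX_CHASSIS_PER_CABINET):
    """Processor sockets in a cabinet populated with `chassis` chassis."""
    return SOCKETS_PER_NODE * NODES_PER_BLADE * BLADES_PER_CHASSIS * chassis

if __name__ == "__main__":
    for chassis in (1, 2, 3):
        print(chassis, "chassis:", sockets_per_cabinet(chassis), "sockets")
```

A fully populated three-chassis cabinet gives 2 x 4 x 16 x 3 = 384 sockets, matching the stated maximum.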
The Intel Xeon multi-core processors provide up to 8,448 cores and deliver up to 297 teraflops per Cray XC liquid-cooled cabinet, and up to 99 teraflops per Cray XC air-cooled cabinet. Future processor upgrades will raise clock frequency and increase the number of embedded cores, accelerating overall system performance. The open architecture of the Cray XC series offers intranode flexibility, giving users the option to run applications with either scalar or accelerator processing elements depending on their requirements for parallelism.
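As a rough cross-check of these figures, the sketch below derives the per-socket numbers they imply and the number of full liquid-cooled cabinets needed to pass one million cores. All inputs are taken from the text; the variable names are illustrative, and the reading of the liquid-to-air ratio is an inference rather than a stated specification.

```python
import math

# Figures quoted in the text.
CORES_PER_CABINET = 8_448     # liquid-cooled cabinet, fully populated
SOCKETS_PER_CABINET = 384
LIQUID_TFLOPS = 297
AIR_TFLOPS = 99

# Implied per-socket figures.
cores_per_socket = CORES_PER_CABINET // SOCKETS_PER_CABINET
tflops_per_socket = LIQUID_TFLOPS / SOCKETS_PER_CABINET

# Full liquid-cooled cabinets needed to exceed one million cores.
cabinets_for_million_cores = math.ceil(1_000_000 / CORES_PER_CABINET)

# Ratio of liquid-cooled to air-cooled peak per cabinet.
liquid_to_air_ratio = LIQUID_TFLOPS / AIR_TFLOPS

if __name__ == "__main__":
    print(cores_per_socket, "cores per socket")
    print(round(tflops_per_socket, 3), "teraflops per socket")
    print(cabinets_for_million_cores, "cabinets for one million cores")
    print(liquid_to_air_ratio, "x liquid-cooled vs air-cooled peak")
```

The ratio of exactly 3 would be consistent with an air-cooled cabinet holding one third as many sockets as a liquid-cooled one, though the text does not say so explicitly.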
Intel® Xeon Phi™ Many-Core Processors
Cray XC series supercomputers also support Intel Xeon Phi many-core processors. These many-core processor compute blades are self-hosting and can be used in homogeneous Cray XC systems or in a hybrid mode together with Intel Xeon multi-core processors. The Cray XC supercomputer architecture supports heterogeneous systems, so users can choose the processor type and compute node structure that best support their diverse applications. These Intel Xeon Phi processor compute blades support four single-socket compute nodes per blade, for up to 586 teraflops of peak performance per cabinet.
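The per-node peak implied by the cabinet figure can be estimated as follows. This sketch assumes that Xeon Phi blades use the same packaging as the Xeon blades described earlier (16 blades per chassis, up to three chassis per cabinet); that assumption is not stated for Xeon Phi blades in the text.

```python
# Quoted in the text.
NODES_PER_BLADE = 4           # single-socket Xeon Phi nodes
PEAK_TFLOPS_PER_CABINET = 586

# Assumed to match the Xeon blade packaging (not stated for Xeon Phi).
BLADES_PER_CHASSIS = 16
CHASSIS_PER_CABINET = 3

nodes_per_cabinet = NODES_PER_BLADE * BLADES_PER_CHASSIS * CHASSIS_PER_CABINET
tflops_per_node = PEAK_TFLOPS_PER_CABINET / nodes_per_cabinet

if __name__ == "__main__":
    print(nodes_per_cabinet, "Xeon Phi nodes per cabinet")
    print(round(tflops_per_node, 2), "peak teraflops per node")
```

Under that assumption a full cabinet holds 192 Xeon Phi nodes, which puts the implied peak at roughly 3 teraflops per node.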
NVIDIA® Tesla® GPU Accelerators
Cray XC series supercomputers support CPU-hosted NVIDIA Tesla GPU accelerators. Two options are available: the NVIDIA Tesla K40 for the Cray XC40 system and the NVIDIA Tesla P100 PCIe for the Cray XC50 system. NVIDIA’s P100 GPU accelerator delivers over 3,500 embedded cores and supports double-precision, single-precision and half-precision computation. It also integrates high-bandwidth memory into the package, enabling up to 3x the memory bandwidth of prior-generation external-memory GPU solutions.
The Cray XC50 system with the Tesla P100 delivers superior application performance, memory bandwidth and performance per watt. Cray also supports multiple programming models for the P100 GPU accelerator, including the Cray compiler, OpenACC directives-based coding and CUDA.
Custom and ISV Jobs on the Same System — Extreme Scale and Cluster Compatibility
Based on generations of experience with both environments, Cray has leveraged a single machine architecture to run both highly scalable custom workloads and industry-standard ISV jobs via the powerful Cray Linux Environment (CLE). CLE provides a Cluster Compatibility Mode (CCM) to run Linux/x86 versions of ISV software without any requirement for porting, recompiling or relinking. Alternatively, Cray’s Extreme Scalability Mode (ESM) provides a performance-optimized environment for custom codes. These operating modes are dynamic and can be selected by the user on a per-job basis.
ROI, Upgradeability and Investment Protection
Besides being customizable for each user’s requirements, the Cray XC series supercomputer architecture is engineered for easy, flexible upgrades and expansion, a benefit that prolongs its productive lifetime and protects the user’s investment. As new technologies become available, users can adopt them deep into the system’s life cycle before ever considering replacing an HPC system.