HPC Cluster Software Stack

Achieving exascale performance will require not only hardware innovations but also validated and tested software tools. In answer to this challenge, Cray delivers customer-centric solutions based on industry-standard hardware with a validated and proven HPC software stack, providing customers with leading application performance and system availability. The operating system and application software layers support x86 processors from Intel® and AMD, or hybrid configurations that pair them with NVIDIA® Tesla® GPU accelerators and Intel® Xeon Phi™ coprocessors.

The HPC cluster software stack for Cray CS300 cluster supercomputers consists of:

  • Operating systems: Linux – Red Hat Enterprise Linux, CentOS and SUSE
  • Middleware applications and management: cluster monitoring, resource management, file system
  • HPC programming tools: development tools, performance monitoring and application libraries

[Diagram: Cray HPC Cluster Software Stack]

Operating System

The operating system is the first level of the software stack. The HPC cluster software stack supports popular Linux distributions, including Red Hat Enterprise Linux, CentOS and SUSE. Because all are Linux distributions built on a common foundation, the main practical differences among them are their installation programs and the support available for each.

The Cray CS300 cluster supercomputer supports the Red Hat Enterprise Linux, SUSE and CentOS operating systems. Cray's Advanced Cluster Engine (ACE™) management software is delivered with Red Hat Enterprise Linux installed on the management nodes; Red Hat was selected for the high level of support available. Cray delivers preconfigured Red Hat, SUSE or CentOS compute node images. Red Hat and SUSE are priced on a per-node basis, while CentOS is open source software that can be used at no cost.

Middleware Applications and Management

At its core, the HPC cluster software stack features Cray's Advanced Cluster Engine software, which provides node OS provisioning for standard operating systems including Red Hat Enterprise Linux and its derivatives (CentOS and Scientific Linux) as well as SUSE Linux. ACE also provides many additional features and functions, including system-level health monitoring and management, power monitoring, and network management for fabrics such as InfiniBand.

On this base, Cray layers industry-leading workload management software, such as the Grid Engine Enterprise Edition resource manager and scheduler or SLURM (Simple Linux Utility for Resource Management). The stack is also compatible with other resource management and job scheduling software, such as Altair's PBS Professional, IBM Platform LSF and Torque/Maui.
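To illustrate the kind of work these schedulers manage, the sketch below shows a minimal SLURM batch script. The job name, partition, resource counts and executable are hypothetical and would be adjusted for a specific site; this is not drawn from Cray documentation.

```shell
#!/bin/bash
# Minimal SLURM batch script (illustrative; partition name,
# node counts and executable are site-specific placeholders).
#SBATCH --job-name=hello-hpc      # job name shown by squeue
#SBATCH --nodes=2                 # number of compute nodes
#SBATCH --ntasks-per-node=16      # MPI ranks per node
#SBATCH --time=00:10:00           # wall-clock limit (hh:mm:ss)
#SBATCH --partition=compute       # hypothetical partition name

# srun launches the MPI ranks under SLURM's control.
srun ./my_mpi_app
```

Such a script would typically be submitted with `sbatch job.sh` and monitored with `squeue`.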

Cray's HPC cluster software stack offers support for common file system options such as the Network File System (NFS), Panasas PanFS™ and Lustre. By providing multiple paths to the physical storage, Lustre enables the use of common storage technologies along with high-speed interconnects to scale well as an organization's storage needs grow.
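On the client side, making a Lustre file system available usually amounts to a single mount, roughly as in the hedged sketch below. The management server name, LNet network type and file system name are placeholders, not values from this product.

```shell
# Mount a Lustre file system on a client node (run as root).
# "mgs01@o2ib" names a hypothetical management server reached over
# an InfiniBand LNet network; "fs0" is an example file system name.
mount -t lustre mgs01@o2ib:/fs0 /mnt/lustre

# Check capacity and per-target usage once mounted.
lfs df -h /mnt/lustre
```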

HPC Programming Tools

HPC programming tools—including development tools, application libraries and performance monitoring—are the third level of the software stack. They support application generation, execution and debugging. The development tools provide flexibility with two choices of compilers, the Cray Compiler Environment (CCE) and Intel Cluster Studio XE. Customers can choose based on application performance requirements or the programming environment they're most familiar with.
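In practice, the choice of compiler mostly changes the build command. The sketch below contrasts the two toolchains for a simple MPI source file; the file and output names are illustrative, and exact environment setup (modules, wrapper names) varies by site.

```shell
# Cray Compiler Environment: the cc driver wraps the Cray C compiler
# and, on Cray systems, typically links MPI and math libraries
# automatically (site configuration varies).
cc -O2 -o my_app my_app.c

# Intel toolchain: mpiicc wraps the Intel C compiler around the
# Intel MPI library (shipped as part of Intel Cluster Studio XE).
mpiicc -O2 -o my_app my_app.c
```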

Application libraries can be included wherever they're required, simplifying user programs and reducing programming time. Cray's HPC cluster software stack provides a choice of MPI libraries, including MVAPICH, Open MPI and Intel MPI, as well as Cray LibSci. Cray LibSci is a library of highly tuned linear algebra solvers that improve CPU and GPU application performance based on problem size, boosting productivity for Fortran, C and C++ programmers.
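Because MPI is a standard interface, switching implementations generally changes only how a job is launched, not the application source. A hedged sketch (executable name and rank counts are examples):

```shell
# Open MPI launcher
mpirun -np 32 ./my_app

# MVAPICH2 also provides the standard mpiexec launcher
mpiexec -n 32 ./my_app

# Intel MPI's Hydra process manager
mpiexec.hydra -n 32 ./my_app
```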

The software stack is completed with performance monitoring tools—including HPCC, Perfctr, IOR, PAPI/IPM and netperf—that are used to measure and verify cluster performance.
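Typical invocations of two of these benchmarks might look like the following sketch; rank counts, transfer sizes and paths are examples, not recommended settings.

```shell
# HPCC reads its parameters from hpccinf.txt in the working
# directory and writes results to hpccoutf.txt.
mpirun -np 64 ./hpcc

# IOR write then read test: 1 MiB transfers, 256 MiB per task,
# against a file on the parallel file system.
mpirun -np 16 ./ior -w -r -t 1m -b 256m -o /mnt/lustre/ior.dat
```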