Brought to you by Cray, the world’s leader in application- and compute-integrated high performance storage for big data and supercomputing, the Cray® Sonexion® 3000 scale-out Lustre system provides the cornerstone of Cray’s end-to-end storage solutions, performance engineered and supported by Cray to reduce TCO, scale efficiently and optimize performance at scale.
As a leader in open systems and parallel file systems, and co-founder and benefactor of OpenSFS, Cray builds on community-driven Lustre to unlock the performance of popular x86 Linux® compute clusters and supercomputers using Cray’s proven storage system architectures.
The Cray Sonexion system supports InfiniBand™ (EDR, FDR) and 40/100 GbE, as well as a broad range of popular Linux distributions through the Lustre Client by Cray.
Reduce TCO by 25 Percent
Through its fully integrated and preconfigured design, Cray Sonexion storage simplifies installation and deployment and reduces the total number of components to manage compared to conventional server- and SAN-based Lustre offerings. The Cray Sonexion system's compact design reduces the total datacenter footprint of petascale systems by up to 30 percent over competing solutions. Using less power also provides big savings, reducing the operating costs by 15 to 20 percent over monolithic SAN-based systems.
As the system needs to be expanded, scalable storage units (SSUs) are added, each providing an efficient, balanced mix of capacity and performance, like a building block. Depending on the type of hard drive used (4 TB, 6 TB or 8 TB), different levels of capacity and performance can be added to the system. In a performance-optimized configuration using the latest HPC drives from Seagate, the Cray Sonexion 3000 system provides 38 percent more sustained performance than the competition in the same amount of space, achieving 98 GB/s sustained throughput and peaking at 112 GB/s per rack. For sustained performance, that equates to 14 GB/s per SSU with up to 7 SSUs in a single expansion rack. From there, customers can add subsequent SSUs and racks, scaling performance as high as 1.7 TB/s in a single file system.
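The scaling arithmetic above can be sketched in a few lines of Python. This is only an illustration built from the figures quoted in the text (14 GB/s sustained per SSU, up to 7 SSUs per expansion rack). The function names are hypothetical and not part of any Cray tool. The sketch assumes every rack is a fully populated expansion rack and that 1 TB/s equals 1,000 GB/s, so it ignores the space a base rack gives up to the MMU and SMU.

```python
import math

# Figures quoted in the text; everything else here is an illustrative assumption.
SUSTAINED_GBPS_PER_SSU = 14   # sustained GB/s per SSU, performance-optimized drives
MAX_SSUS_PER_RACK = 7         # "up to 7 SSUs in a single expansion rack"


def rack_throughput_gbps(ssus: int = MAX_SSUS_PER_RACK) -> int:
    """Sustained throughput of one rack holding `ssus` SSUs."""
    return ssus * SUSTAINED_GBPS_PER_SSU


def ssus_for_target(target_gbps: float) -> int:
    """Smallest number of SSUs whose combined sustained rate meets the target."""
    return math.ceil(target_gbps / SUSTAINED_GBPS_PER_SSU)


def racks_for_ssus(ssus: int) -> int:
    """Expansion racks needed, assuming each holds the maximum of 7 SSUs."""
    return math.ceil(ssus / MAX_SSUS_PER_RACK)


if __name__ == "__main__":
    print(rack_throughput_gbps())          # one full rack
    needed = ssus_for_target(1700)         # 1.7 TB/s expressed in GB/s
    print(needed, racks_for_ssus(needed))  # SSUs and racks for the quoted maximum
```

Under these assumptions a full rack works out to the 98 GB/s quoted above, and the 1.7 TB/s figure corresponds to roughly 122 SSUs. Actual configurations depend on drive type and rack layout.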
Capacity also scales in modular increments, depending on drive type. In a capacity-optimized configuration, the Sonexion 3000 system stores about 3.4 usable petabytes in a single rack. Fewer drives and components reduce capital costs as capacity grows.
Performance-Optimized End to End
Cray is well known as the leader in supercomputing. The knowledge and skills developed through a rich history of delivering and supporting the world’s fastest systems are founded on deep, holistic systems expertise, including optimizing I/O subsystems. Cray’s end-to-end system architectures are optimized based on the needs of the application workload and compute system, and tuned as needed across all aspects of the system, from the compute clients to the network to the drives.
Today, Lustre is the primary parallel file system of choice used by Cray for big data and supercomputing. Cray’s first deployment goes back to Sandia Labs’ “Red Storm” system in 2004. Over the years, Cray’s experience in both Lustre and holistic systems design has given the company a large and growing foundation in Lustre expertise — and especially in delivering Lustre at scale. This experience benefits a diversity of compute platforms running Linux — as well as Cray supercomputers.
Single Point of Support for Everything
Cray provides direct customer support for the entire solution, including all software and hardware. Cray’s superior knowledge of Lustre, networking and interfaces into the compute platform ensures that the highest levels of knowledge are available to troubleshoot any problems that may arise. If the Cray Sonexion system connects to a Cray analytics or compute environment, Cray will also provide a single point of support for the entire solution.
Data Protection and Connectivity to Third-Party Archives
New software-based Grid RAID offers higher levels of data protection and up to 3.5 times faster rebuild times than traditional RAID6 and MD-RAID storage.
Cray ensures quality, reliability and stability at scale through exhaustive thermal and real-world stress testing, system hardening and availability, and tight hardware and software integration.
Cray offers a software-only partner solution for data tiering and archiving. This option allows data to be migrated between Lustre and the Versity Storage Manager hierarchical storage manager to ensure data can be protected nearline, offline or using an active archive tape system.
The Cray® Sonexion® storage system’s modular, compact design keeps costs low while delivering the right performance for compute clusters and applications of all types. Its compact form factor reduces the total storage hardware infrastructure (cables, servers, components and racks) required for sustaining production-grade, petascale deployments — by 30 percent on average over SAN-based configurations.
The Sonexion 3000 system’s hardware architecture consists of a preconfigured, 42U (unit) rack-level storage system that can be easily expanded using modular storage building blocks. The principal hardware components include:
A 42U rack containing power supplies, cabling and switches
A scalable storage unit (SSU) containing Lustre® Object Storage Servers and Object Storage Target drives
A metadata management unit (MMU) containing the Lustre Metadata Server (MDS), the Lustre Management Server (MGS) and Metadata Target (MDT) drives
A system management unit (SMU) containing an HA pair of embedded servers for file system management, boot and storage
Two 36-port EDR InfiniBand or 100/40 GbE network fabric switches
Two gigabit Ethernet management switches
The Sonexion system comes preconfigured and optimized for scaling Lustre without redesign. Cray’s Lustre design ensures optimum performance configurations across the spectrum of Lustre: from initial deployments to multipetabyte file systems. Storage operators simply add SSUs (capacity and performance) or expansion modules to meet the performance and capacity objectives of the storage system. Each file system will vary based on number of SSUs and expansion modules to meet the individual bandwidth and usable capacity requirements for each storage system. Each 42U rack comes with two InfiniBand switches for linking the MMU, SSUs and Lustre clients. The rack also contains all power supplies, InfiniBand and Ethernet cabling, and a dual gigabit Ethernet switch for management system connections to individual components.
A base MMU and SMU are configured in a 2U 24-bay 2.5-inch drive enclosure with 22 10,000 RPM disk drives and two solid-state drives. A 5U MMU is also available with up to 80 15,000 RPM SAS drives, enabling higher metadata performance.
A base SSU is housed in a 5U 84-bay 3.5-inch drive enclosure with 80 drives used to provide data storage in an 8 x (8+2) RAID6 target configuration, two global hot spares and two solid-state drives. An SSU expansion enclosure, which has the same drive configuration as the base SSU, can be added to double the usable capacity for a given bandwidth.
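The drive layout described above can be checked with a short Python sketch. It uses only the numbers in the text: eight RAID6 groups of 8 data plus 2 parity drives, two global hot spares and two solid-state drives in an 84-bay enclosure. The names are hypothetical. The capacity figure is a simple data-drive total in decimal terabytes that ignores file system and formatting overhead, so it is an upper bound, not a vendor sizing number.

```python
# Layout figures quoted in the text; names and the capacity estimate are assumptions.
RAID_GROUPS = 8        # "8 x (8+2)" RAID6 targets per SSU
DATA_PER_GROUP = 8     # data drives in each RAID6 group
PARITY_PER_GROUP = 2   # parity drives in each RAID6 group
HOT_SPARES = 2         # global hot spares
SSDS = 2               # solid-state drives
ENCLOSURE_BAYS = 84    # 5U 84-bay enclosure


def ssu_layout() -> dict:
    """Count the drives in one base SSU by role."""
    raid_drives = RAID_GROUPS * (DATA_PER_GROUP + PARITY_PER_GROUP)
    data_drives = RAID_GROUPS * DATA_PER_GROUP
    bays_used = raid_drives + HOT_SPARES + SSDS
    return {
        "raid_drives": raid_drives,
        "data_drives": data_drives,
        "bays_used": bays_used,
    }


def data_capacity_tb(drive_tb: int, enclosures: int = 1) -> int:
    """Data-drive capacity in TB, before file system overhead.

    `enclosures` is 2 when an expansion enclosure with the same
    drive configuration is attached to the base SSU.
    """
    return ssu_layout()["data_drives"] * drive_tb * enclosures


if __name__ == "__main__":
    layout = ssu_layout()
    assert layout["bays_used"] == ENCLOSURE_BAYS  # every bay is accounted for
    for size_tb in (4, 6, 8):                     # drive sizes named in the text
        print(size_tb, data_capacity_tb(size_tb), data_capacity_tb(size_tb, enclosures=2))
```

The check confirms that 80 RAID drives, two spares and two solid-state drives fill all 84 bays, and that 64 of the 80 RAID drives hold data. Attaching an expansion enclosure doubles the data-drive capacity without adding bandwidth, as the text describes.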
Cray Sonexion Software
The Cray Sonexion System Manager (CSSM) software application simplifies the experience of deploying and managing a Lustre file system. Graphical and command line interfaces — integrated with third-party tools — provide system administrators and users an intuitive interface to deploy, monitor and optimize the entire system.
CSSM provides comprehensive status and control of all system components, including storage hardware, RAID, operating system and the Lustre file system in an integrated, easy-to-use administrator interface. A web client hosted on one of the dual controller modules in the metadata management unit (MMU) interfaces with all distributed system manager component services. CSSM also integrates a comprehensive set of community-developed tools to collect, index and analyze fast-moving data.
The MMU module hosts the metadata and management server operations. SSU modules operate as active/active-integrated server modules with redundant and independent system interconnections, providing maximum reliability while delivering maximum performance. The CSSM is tightly integrated into the system stack — from storage and embedded server modules to the Lustre file system and the entire storage cluster — enabling rapid, accurate monitoring and diagnosis down to the component level. Systemwide software and firmware upgrades are executed through a simple and single interface in the CSSM system, removing the complexity and risks of traditional large Lustre implementations.
With its strategic Lustre partners Intel and Seagate, Cray proudly co-founded and currently leads OpenSFS, a technical organization focused on moving Lustre forward and ensuring the success of parallel open-source file-system technologies.
Cray® Sonexion® Technology
Brought to you by Cray, the world’s leading experts in large-scale parallel storage solutions for HPC and technical computing, the Sonexion® system provides a fully integrated, scale-out Lustre® storage system for industry-standard Linux® compute clusters. The Sonexion system’s modular and compact form factor provides precision performance and capacity scalability to reduce capital costs. Performance scales in modular building blocks, reducing the number of hard drives needed to achieve sustained performance at scale. Moreover, the Sonexion system scales in predictable, performance-optimized configurations, ensuring precise levels of performance and stability as capacity expands.
The Sonexion system's compact form factor reduces the total storage hardware infrastructure (cables, servers, components and racks) required for sustaining production-grade, petascale deployments — in some cases up to 300 percent over on-site-designed Lustre configurations.
The Cray Sonexion scale-out system maximizes the performance and capacity scaling capabilities of the Lustre file system. This integrated and modular storage solution is composed of high performance scalable storage units (SSUs), a metadata management unit (MMU), a systems management unit and a network-ready rack that includes all storage and processing needs for a complete, production-grade parallel storage system.
Each SSU is physically capable of delivering 9 to 14 GB/s of sustained bandwidth, depending on the drive type. Using the IOR benchmark, Cray’s performance team has benchmarked real-world, sustained file system performance at 14 GB/s per SSU using the latest HPC drive technology and up to 1.7 TB/s sustained performance to a single file system.
Reduce TCO by 25 Percent
Every Sonexion system’s rack comes pre-assembled, integrated, configured and tested. More capacity and performance can be added in modular building blocks. For large-scale systems, the Sonexion system’s compact design ensures greater efficiency and higher utilization within an individual rack and across the datacenter. This power efficiency and extreme density for Lustre reduce the cost of operating petascale storage systems. In large petascale systems, the datacenter footprint, counted in number of racks for a given performance and capacity level, can be reduced by up to one-third.
This translates into delivering more capability in less space. Compared to competing virtual and SAN-based Lustre offerings, the Sonexion system delivers 38 percent more high-throughput performance in the same space as the competition.
Moreover, Cray Sonexion utilizes power and cooling more efficiently, reducing consumption by 15 to 20 percent compared to the competition. The net effect is that TCO is reduced by about 25 percent compared to the competition when considering all costs over the first three years of use.
Easier to Deploy and Manage at Scale
Because the Sonexion storage system comes pre-integrated, pretested and preconfigured, deployment time is greatly reduced. All components — networking, storage, RAID, operating system and Lustre — come preconfigured and precabled. There are no external servers, switches or ad hoc systems to manage. Simply add storage building blocks — SSUs, expansion modules and racks — to achieve prescribed performance and capacity objectives. Cray’s expertise integrating and configuring Lustre is embedded in the design, enabling customers to focus on research instead of Lustre.
Quality: Reliability, Availability, Serviceability and Stability
Cray’s expertise in designing, deploying, optimizing and supporting large-scale parallel storage systems has enabled us to deliver a highly reliable, available and stable clustered storage system. The modular and redundant architecture of Sonexion systems provides the highest reliability and resiliency to Lustre storage solutions for HPC and technical computing. In addition, each component, module and subsystem undergoes exhaustive factory testing under the most demanding test conditions to ensure maximum robustness. Cray’s quality assurance team performs thermal and failure stress testing that simulates worst-case production scenarios.
Lustre® is the file system of choice for the world’s fastest supercomputers, powering over 60 percent of the world’s top 100 supercomputing systems. That number is growing and expanding to big data and high performance computing deployments not measured on the Top500® list. Lustre provides the world’s leading storage I/O scalability, and it is chosen by most of our customers.
Cray’s Lustre model ensures stability and validation of real-world customer configurations through extreme testing, software enhancement and collaborative innovation. We aggressively test, validate and update Lustre — delivering changes back into the common Lustre source base through OpenSFS (Open Scalable File Systems), a key consortium for advancing Lustre. As an original founder and board member of OpenSFS, Cray jointly leads and funds development of key features to ensure Lustre's success.
In partnership with Cray’s strategic Lustre partners Intel and Seagate, Cray proudly represents customers solving some of the world’s largest and most challenging problems and deploying Lustre at scale. Through OpenSFS, Cray, Intel and Seagate are together focused on moving Lustre forward and ensuring the success of parallel open-source file system technologies.
High-performance storage and data management solutions drive insight from data — and Cray brings you the workflow-driven storage solutions you need to optimize the entire I/O path from applications to disk.
King Abdullah University of Science & Technology (KAUST) in Saudi Arabia selected a Cray system that includes a Cray XC40 supercomputer with DataWarp technology, a Cray Sonexion storage system, Tiered Adaptive Storage (TAS) and Cray’s Urika-GD graph analytics appliance.