Cray System Snapshot Analyzer

In large, complex supercomputing systems, component failures are statistically inevitable over time. The Cray® System Snapshot Analyzer (SSA) proactively addresses system health with automated, remote system status checks, or “snapshots.” When failures and other issues do occur, the snapshots can shorten the time to a solution. Using the SSA for system queries and data collection can dramatically improve issue detection and minimize the overhead users typically face when communicating symptoms to the Cray support team.

Regular, automated health snapshots can provide insights that help predict future system behavior and guide triage, often before issues reach a critical state. When component failures do occur, the System Snapshot Analyzer helps accelerate support diagnosis and shorten time to resolution.

The Cray System Snapshot Analyzer enables faster issue response, less user overhead and more personalized customer support, while also improving uptime management and total cost of ownership (TCO).

SSA identified several failures on our system before we noticed them. It was very handy to have the information prepared for Cray support personnel to review and respond to without having to perform manual steps.

– Liam Forbes, UAF RCS HPC Systems Analyst / GI ARSC Interim Director

Cray System Snapshot Analyzer Product Brief

Cray System Snapshot Analyzer White Paper

What Cray products does the System Snapshot Analyzer support?

System Snapshot Analyzer currently supports Cray® XC40™, XC30™, XK7™, XK6™ and XE6™ supercomputer systems and the Cray® Sonexion® scale-out Lustre® storage system.

How do I get the Cray System Snapshot Analyzer?

Cray customers and users can download the SSA client via CrayPort. If the System Snapshot Analyzer package is available for your product, it should be visible to you there. When downloading, you’ll have to accept the SSA license.