Seymour Cray, the pioneer of supercomputing, famously asked if you would rather plough a field with two strong oxen or 1024 chickens. The question has since been answered for us: power restrictions have driven CPU manufacturers away from “oxen” (powerful single-core devices) towards multi- and many-core “chickens.” An exascale supercomputer will take this a leap further, connecting tens of thousands of many-core nodes, leaving application programmers with the challenge of efficiently harnessing the computing power of tens of millions of threads.
This challenge is not just for the applications themselves but for everything underneath: from the operating and runtime systems, through the communication and scientific libraries, to the compilers and tools that will all help deliver exascale performance. To tackle these areas, Cray® established the Cray Research Initiative Europe back in 2009. One of the most prominent projects within this initiative is CRESTA (Collaborative Research into Exascale Systemware, Tools and Applications), in which Cray has partnered with important high performance computing (HPC) centres and with software and tools developers across Europe.
Funded by the European Union, CRESTA is focusing on six applications with exascale potential. Co-design is at the project’s core: real application requirements drive systemware developments and research, which then feed back into the applications in an ongoing, virtuous cycle. Selected by the CRESTA HPC centre partners (EPCC, formerly known as Edinburgh Parallel Computing Centre; HLRS; ECMWF; CSC Finland; PDC-KTH; and DLR), the applications represent a broad range of domains including CFD, numerical weather prediction, biomolecular systems, fusion energy, and physiological flows. The CRESTA work includes using new programming models (such as PGAS languages and OpenACC) and improved libraries (such as FFTs and sparse matrix operations), as well as introducing fault tolerance in the applications and, under the hood, in the communication libraries. Improved compilers, workflow, and diagnostic tools (for example, DDT and Vampir from partners Allinea and TU Dresden) allow the application developers to identify and remove bottlenecks and to bridge the huge performance gap between petascale and exascale.
A key enabler of CRESTA’s work is access to the many large Cray supercomputers installed at the CRESTA partner sites in Europe as well as the Cray® XK7™ “Titan” system at Oak Ridge National Laboratory through the U.S. Department of Energy INCITE program. Three of CRESTA’s partner co-design applications have already benefited from the INCITE program to showcase the advances already made.
EXAMPLES OF HOW CUSTOMERS BENEFIT FROM CRESTA
Numerical Weather Prediction
ECMWF uses the Integrated Forecast System (IFS) model to provide medium-range weather forecasts to its 34 European member states. Today’s simulations use a global grid with a 16 km resolution, but ECMWF expects to refine this to a 2.5 km global weather forecast model by 2030 using an exascale-sized system. To achieve this, IFS needs to run efficiently on a thousand times more cores. The CRESTA improvements have already enabled IFS to use over 200,000 CPU cores on Titan. This is the largest number ever used for an operational weather forecasting code and represents the first use of the pre-exascale 5 km resolution model that will be needed for medium-range forecasts in 2023. This breakthrough came from using new programming models to eliminate a performance bottleneck. For the first time, the Cray Compiler Environment (CCE) was used to nest Fortran coarrays within OpenMP, so that communication overlaps with existing calculations and its cost is largely hidden.
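The idea of hiding communication behind calculation can be illustrated with a small, generic sketch. It is not taken from IFS, which uses Fortran coarrays inside OpenMP regions; here a Python background thread stands in for the remote transfer, and the names `exchange_halo` and `smooth_step`, the three-point averaging and the 0.05 s delay are all hypothetical choices made purely for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
import time


def exchange_halo(left_neighbour, right_neighbour, latency=0.05):
    """Stand-in for a remote transfer (hypothetical): after a simulated
    delay, return the edge values owned by the two neighbouring domains."""
    time.sleep(latency)  # simulated network latency, not a real transfer
    return left_neighbour[-1], right_neighbour[0]


def smooth_step(local, left_neighbour, right_neighbour):
    """One three-point averaging step on `local` (length >= 2), with the
    halo exchange overlapped with the interior update."""
    n = len(local)
    result = [0.0] * n
    with ThreadPoolExecutor(max_workers=1) as pool:
        # Start the (simulated) communication in the background.
        pending = pool.submit(exchange_halo, left_neighbour, right_neighbour)
        # Meanwhile, update the interior points, which need no remote data.
        for i in range(1, n - 1):
            result[i] = (local[i - 1] + local[i] + local[i + 1]) / 3.0
        # Only the two edge points have to wait for the halo values.
        left_halo, right_halo = pending.result()
    result[0] = (left_halo + local[0] + local[1]) / 3.0
    result[-1] = (local[-2] + local[-1] + right_halo) / 3.0
    return result
```

In this sketch the interior loop runs while the transfer is "in flight", so only the two edge updates wait for remote data. The real gain depends on how much independent work is available to cover the transfer time.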
Computational Fluid Dynamics (CFD)
CFD applications are already some of the world’s largest users of supercomputers, and CRESTA reflects this importance by including two examples: OpenFOAM® and Nek5000. CRESTA researchers have been using the OpenACC programming model to extend the existing Nek5000 code to portably and productively exploit accelerators. Adding only one OpenACC directive per thousand lines of Fortran code has already allowed a Nek5000 test case to be efficiently scaled across more than 16,000 GPU nodes of Titan, with a near-threefold increase in performance compared to just using the CPUs.
Biomolecular Systems
Gromacs is a classical molecular dynamics package for simulating the behaviour of millions of particles either in biochemical molecules (like proteins, lipids, or nucleic acids) or non-biological systems, such as polymers. The CRESTA work has focused on efficient, simultaneous exploitation of CPUs and GPUs with a scientific drive to understand the mechanism of membrane fusion in viruses.
The work doesn’t only help Gromacs; the interaction between the CRESTA partners and Cray R&D has driven some big improvements in the Cray Compiler Environment (CCE) that will benefit Cray users worldwide.
With another year still left to run, we can expect more big improvements in applications, systemware, and tools from CRESTA. But the exascale challenge isn’t finished and, with follow-on projects like EPiGRAM starting, Cray Europe has already started work on the next step in harnessing all those chickens to the exascale plough.