The approaching exascale era has raised many new questions for the high-performance computing industry, and one of the biggest concerns the development of more efficient and economical power and cooling technologies in the data center. To solve exascale-sized problems, HPC systems will need to run at power levels well beyond what we see even with today’s largest supercomputers, which will make operating, maintaining and cooling these systems exorbitantly expensive. Cooling systems have always been among the largest consumers of power in the data center, and a collective shift to new, more efficient methods will be required to support the next generation of HPC systems.
Today’s HPC data centers are constrained in terms of the amount of total power they can provide. In a perfect world, close to 100% of total available power would be used for computing, but the reality is that a large amount currently goes toward cooling, both at the facility and rack level. Traditional air-cooled HPC systems can use up to 20% of total available power for facility cooling and an additional 15% for cooling components within the rack, leaving only 65% of total power to be used for actual computing.
Using 35% of available power for cooling alone is an equation that simply won’t work at the exascale level.
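To make the power-budget arithmetic above concrete, here is a quick back-of-the-envelope sketch. The 20% and 15% cooling overheads are the figures cited above; the 30 MW total power envelope is a hypothetical number chosen only for illustration, not a measured exascale budget.

```python
# Back-of-the-envelope power budget for a traditional air-cooled HPC
# data center, using the percentages cited in the article.

def compute_share(facility_cooling: float, in_rack_cooling: float) -> float:
    """Fraction of total power left for actual computing after cooling overhead."""
    return 1.0 - facility_cooling - in_rack_cooling

# Traditional air cooling: up to 20% facility cooling + 15% in-rack cooling.
air_cooled = compute_share(0.20, 0.15)
print(f"Compute share with air cooling: {air_cooled:.0%}")  # 65%

# At a hypothetical 30 MW power envelope, that overhead is a large
# absolute cost even before the electricity bill is considered.
total_mw = 30
cooling_mw = total_mw * (0.20 + 0.15)
print(f"Power spent on cooling: {cooling_mw:.1f} MW")
```

The point of the sketch is simply that the cooling overhead scales linearly with the facility's total power draw, so at exascale power levels the same percentages translate into multi-megawatt losses.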
Over the years, the growing heat output, size and density of each new generation of supercomputers have challenged the high-performance computing industry to continuously develop and adopt new cooling techniques. Air-cooled systems have been popular for decades because air cooling is simple and inexpensive to implement; however, it also offers the lowest thermal performance and is the least efficient method from an energy perspective. Liquid cooling is now rapidly gaining popularity as water and other fluids (which have far higher thermal conductivity than air) prove to be a much more efficient way to cool larger, high-density systems.
While air cooling has been the preferred technique among original design manufacturers (ODMs), it’s clear that conventional air-cooled methods are becoming inadequate to support a new generation of systems. There needs to be a collective shift within the HPC industry from traditional front-to-back air-cooled systems to much more efficient alternative cooling solutions. These methods, which will inevitably incorporate some form of liquid cooling, will represent a higher in-rack cooling investment in the short term, but promise to deliver greater energy savings, lower total cost of ownership (TCO) and higher flexibility over the long term.
Cray has a long history of power and cooling innovation that is grounded in decades of HPC expertise. Early iterations of our supercomputers were among the first to incorporate cooling techniques that were groundbreaking at the time, and they paved the way for those techniques to become mainstream. Drawing on that history of innovation, our latest cooling architecture has been designed from the ground up with flexibility and configurability in mind. Not only do our systems incorporate the very latest liquid-cooling technologies, they also have built-in software that can report how energy is being used and make recommendations for increasing efficiency. Our systems can also make automated adjustments to improve energy usage, such as moving more energy-intensive jobs to off-peak times. We even provide the ability to scale cooling up or down according to the exact needs of the system, so you can cool exactly what you need to without incurring additional overhead costs.
Perhaps most importantly, we take a holistic approach to cooling that isn’t focused solely on advancing our own agenda as a supercomputer manufacturer. We’re researching, designing and implementing forward-thinking cooling system technologies that balance costs, performance and efficiency, and the result is not only real energy savings for data center facilities, but also quantum leaps forward for the HPC industry as a whole. Our latest technologies have the potential to reduce the amount of total power used for data center cooling from 20% to as low as 3%, and the power used for in-rack cooling from 15% to 2%. This is a much more powerful equation to carry us into the exascale era.
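A short sketch, using only the percentages quoted above, shows how much of the power budget those improvements would return to computing. The figures are the article's projections, not measured results.

```python
# Compare the compute share of the power budget under the legacy
# cooling overheads vs. the improved figures cited in the article.

legacy = {"facility_cooling": 0.20, "in_rack_cooling": 0.15}
improved = {"facility_cooling": 0.03, "in_rack_cooling": 0.02}

def compute_fraction(overheads: dict) -> float:
    """Fraction of total power available for computing."""
    return 1.0 - sum(overheads.values())

print(f"Legacy compute share:   {compute_fraction(legacy):.0%}")    # 65%
print(f"Improved compute share: {compute_fraction(improved):.0%}")  # 95%
```

In other words, cutting combined cooling overhead from 35% to 5% would raise the usable compute fraction from roughly 65% to roughly 95% of total facility power.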
Join us Wednesday, October 10, for a live online chat where we’ll take a look inside the changing face of HPC cooling as we move toward exascale. You can easily participate using a Twitter, LinkedIn or Facebook account.
How to participate:
The chat will be hosted on CrowdChat, which organizes tweets into streams of conversations. We’ll start the conversation thread from the Cray Twitter handle (@cray_inc), and everyone who wants to participate can comment, ask questions, or just watch the discussion from the sidelines. You can participate by logging in with Twitter, LinkedIn or Facebook.