“Titan” Supercomputer: More than an Engineering Marvel

Like Cray, the Oak Ridge Leadership Computing Facility (OLCF) at Oak Ridge National Laboratory (ORNL) is dedicated to solving the world’s toughest problems using supercomputers. Perhaps that’s why Cray and ORNL have a relationship dating back three decades to when the facility’s first Cray supercomputer, a Cray® X-MP™, came to East Tennessee in 1985.

After four years of administering Cray systems for the U.S. Air Force in the early ’80s, I came to Oak Ridge to help run the X-MP system. Since then, I’ve seen a long line of Cray supercomputers cycle through ORNL. Each new generation of systems has improved upon its predecessors in small, and oftentimes big, ways that helped advance ORNL’s broad science and energy mission.

The most recent example of this steady progression sits across the hall from my office at the OLCF: the Cray® XK7™ system we call “Titan,” the second-fastest supercomputer in the world. Titan is an impressive feat of engineering, but I also like to think of it as a testament to the many years of ingenuity and dedication of OLCF staff members and longtime partners like Cray.

Six years ago, Titan’s resourceful GPU-CPU architecture was conceived by the OLCF and Cray as the answer to two seemingly incompatible objectives: a significant increase in computational power for a minimal increase in energy consumption. Though some outsiders doubted our approach, the collaboration between the OLCF, Cray and NVIDIA resulted in a hybrid machine that lived up to the billing: a performance-oriented, highly parallel system capable of enabling cutting-edge science. The development of the Titan supercomputer marked a new paradigm in heterogeneous computing.

Entering its third year of operation, Titan continues to prove its value. As a user facility for open science, the OLCF makes the Titan supercomputer available to researchers from around the world through initiatives such as the Innovative and Novel Computational Impact on Theory and Experiment, or INCITE, program.

Jointly managed by the OLCF and Argonne Leadership Computing Facility (ALCF), INCITE connects researchers who need anything from capability computing to leadership computing resources with those systems and with the computing experts who can help them make the most of them. Recent INCITE teams have leveraged Titan to achieve scientific milestones in a variety of fields, including breakthroughs in combustion physics by Jacqueline Chen of Sandia National Laboratories, superconductive materials by Paul Kent of ORNL and astrophysics by Salman Habib of Argonne National Laboratory, among others.

Last year, INCITE allocated more than 5 billion core-hours. This year, the program expects to award approximately 6 billion core-hours split between the ALCF’s “Mira” and the OLCF’s Titan supercomputer, with awards per project ranging from tens to hundreds of millions of core-hours. INCITE is currently accepting proposals for 2016.

Titan’s ongoing success builds on past accomplishments shared by the OLCF and Cray dating back to the formation of ORNL’s leadership computing facility in 2004. The designation led to an upgrade of the then-flagship system “Phoenix,” a Cray® X1E™ system, and an agreement to build “Jaguar,” a Cray® XT3™ system based on the Red Storm architecture designed by Cray and Sandia National Laboratories.

When Jaguar debuted at the OLCF in 2005, it was a 25-teraflop system with a few thousand single-core processors. In its final iteration in 2009, the Cray® XT5™ supercomputer Jaguar contained nearly 300,000 cores and boasted a peak performance of 2.3 petaflops, nearly a 100-fold increase in computational power over the XT3 supercomputer. Between upgrades, Cray provided the know-how needed to deliver the world’s most powerful supercomputers leveraging the economy of commodity processors.

Equally important to the OLCF and its users are Cray’s efforts to help application developers scale their codes to perform on each new system through collaborations like the Cray Supercomputing Center of Excellence at ORNL. When the Center debuted in 2005, it was the first of its kind. Today, it’s a model other leadership computing facilities follow to ensure application codes maximize next-generation supercomputers’ capabilities.

High performance computing has changed in many ways since ORNL hosted the Cray X-MP supercomputer 30 years ago, but the role of supercomputers at the OLCF remains the same: to enable scientific discovery. By that measure, the OLCF and Cray share an unparalleled track record, one that continues to remap the boundaries of science and technology.

For more information about INCITE or to submit a proposal, go to https://proposals.doeleadershipcomputing.org/allocations/calls/incite2016.

For information about the science and systems of the OLCF, go to http://www.olcf.ornl.gov.

 
