In a blog post last month, Cray’s Jay Gould examined the critical role of software in a supercomputer. In this latest software series, David Wallace, Cray’s Director of HPCS Software Product Management, provides a thorough understanding of OpenACC, its programming benefits and how the industry coalition is evolving.
Why does OpenACC exist?
OpenACC was established for several reasons; however, the most compelling one is that users needed an industry-standard set of accelerator directives immediately, and the OpenACC founding members each had directive sets that were somewhat alike. PGI and CAPS had both already released their own sets of directives, and Cray was preparing to release a set as well. All three of these directive sets were different and unique to their respective vendors. This plethora of directive sets that did similar things but were spelled differently led to a confused user community that wondered whether it should just wait for OpenMP to finish working on its directive set. This position was worrisome to vendors and users alike; waiting for a large organization to finish its work meant delayed adoption of a programming model that could effectively support heterogeneous systems that were already being deployed.
Briefly, what is OpenACC all about? What problem is OpenACC attempting to solve?
OpenACC is about giving programmers a set of tools to port their codes to new heterogeneous systems without having to rewrite the codes in proprietary languages. The overhead of having to rewrite code to run on both homogeneous and heterogeneous systems was sufficiently high that many users were unwilling to consider undertaking the process. Directive sets like OpenACC are attractive because they allow the source code to remain predominantly unchanged and yet be targeted at both homogeneous and heterogeneous systems. Because the directives can be inserted into C, C++ and Fortran codes, users can avoid the difficulty of rewriting their codes in a new language that may be less robust than their language of choice.
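To make the "source code remains predominantly unchanged" point concrete, here is a minimal sketch in C. The function name `saxpy` and the choice of data clauses are illustrative assumptions, not taken from any Cray code. The only addition to an ordinary serial loop is a single `#pragma acc` line; a compiler without OpenACC support simply ignores the pragma and builds the same loop for the host.

```c
#include <assert.h>

/* Illustrative example: computes y = a*x + y for n elements.
 * The name "saxpy" and the clauses below are assumptions made for this sketch. */
void saxpy(int n, float a, const float *x, float *y)
{
    assert(n >= 0);

    /* The directive asks an OpenACC compiler to run the loop on the accelerator:
     * x is copied to the device, y is copied to the device and back.
     * A compiler without OpenACC support ignores this line. */
    #pragma acc parallel loop copyin(x[0:n]) copy(y[0:n])
    for (int i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}
```

The same source file can therefore be built for a homogeneous system and for a heterogeneous one; only the compiler flags differ.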
What organizations are involved in OpenACC?
This question is interesting, as there are the “founding” members and then there are the “new” members. OpenACC was founded by four interested companies: CAPS, Cray Inc., Nvidia and PGI. These four companies are responsible for the original OpenACC 1.0 specification and the formation of the OpenACC organization. After its founding, the OpenACC organization began to grow and is now made up of the surviving founding members as well as numerous corporate, government and academic organizations. The list of “partners” is currently 11 members excluding the founding members. Some of the “first” partners were CSCS, University of Houston, and Technical University Dresden. A full roll call of current members would be: CAPS, Cray, CSCS, EPCC, Georgia Tech, University of Houston, Indiana University, Nvidia, Oak Ridge National Laboratory, PGI, Technical University Dresden, and Tokyo Institute of Technology. Both Allinea and Rogue Wave are supporters, but I do not believe they are partners as yet.
How is OpenACC evolving?
To look at where OpenACC is going, we should look at where OpenACC came from. The first version of OpenACC was the culmination of years of research and development. The three main vendors involved, CAPS, Cray and PGI, had all developed their own directives and were working actively with the OpenMP language committee on extensions to OpenMP to support accelerators, based on a proposal submitted by Cray. Unfortunately, a committee like OpenMP must move much more slowly than the market was willing to endure. To satisfy demand, the directive sets from PGI and Cray were merged into a single coherent set of directives that benefited from the experiences of the members when developing their directive sets.
OpenACC 2.0, which was just released, evolved along three fronts: corrections and clarifications, user-driven features and device-driven features. The corrections and clarifications were often needed because the vendors had differing opinions on what the spec meant, resulting in implementations that were not consistent with each other. An example of a user-driven addition would be unstructured data lifetimes. These were found to be important in order to enable any reasonable level of support for C++. An example of a device- or vendor-driven addition would be support for function calls, driven by the fact that we now had at least two targets with linkers, Intel Xeon Phi and NVIDIA Kepler.
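The two OpenACC 2.0 additions named above can be sketched in C as follows. All function names here (`field_alloc`, `field_free`, `square`, `field_fill_squares`) are hypothetical and chosen only for illustration. The `enter data` and `exit data` directives give device data a lifetime that is not tied to a lexical block, which is what lets allocation and deallocation live in separate functions (or, in C++, in a constructor and destructor). The `routine` directive marks a function so that it can be called from device code.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: allocates a host array and starts its device lifetime. */
double *field_alloc(int n)
{
    assert(n > 0);
    double *f = malloc((size_t)n * sizeof *f);
    if (f == NULL) {
        return NULL;
    }
    /* Unstructured data lifetime: the device copy outlives this function. */
    #pragma acc enter data create(f[0:n])
    return f;
}

/* Hypothetical helper: ends the device lifetime, then frees the host array. */
void field_free(double *f, int n)
{
    #pragma acc exit data delete(f[0:n])
    free(f);
}

/* Marks square() as callable from accelerator code, executed sequentially. */
#pragma acc routine seq
double square(double v)
{
    return v * v;
}

/* Fills f[i] with i*i on the device, then refreshes the host copy. */
void field_fill_squares(double *f, int n)
{
    #pragma acc parallel loop present(f[0:n])
    for (int i = 0; i < n; ++i) {
        f[i] = square((double)i);
    }
    #pragma acc update host(f[0:n])
}
```

Built without OpenACC support, the pragmas are ignored and the code behaves as plain serial C, so the same source still runs on a homogeneous system.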
OpenACC will continue to evolve along these same lines. The feature that Cray is currently most involved in is support for moving more complex objects to the device. This feature is intended to support arrays of objects that contain dynamic arrays, that is, pointer-based arrays. Customers have identified this feature as useful. Part of the feature will also enable programmers to move only the parts of a complex object that are needed, with some constraints. More on this subject will be provided in the future.
Why is Cray involved in OpenACC? What does Cray see as benefits of being a member of OpenACC?
Cray is involved in OpenACC for several reasons. The first reason was that we were preparing to release our accelerator directives and it did not seem beneficial to our users to release yet another incompatible set of directives. OpenACC gave us the perfect fast-path solution to this problem. We continue to be involved with OpenACC because the focus of the specification allows it to move much more quickly than something like OpenMP, which has a much larger set of directives to worry about. I would say the primary benefit we gain from being a member of OpenACC is that we can influence the direction of the specification and try to ensure that our users’ needs are met in a timely manner.
Why doesn’t Cray just wait for the specification to be released?
If Cray were not involved with OpenACC, we would have to wait until the specification was released in some form to begin developing our support in the compiler. This can mean years of lag time between when a version is released and when it is supported. By actively participating in the development of the specification, we can continually evolve our implementation so that we can release support for a spec in a more timely manner. We feel this better serves our users’ needs, since they are trying to use our new supercomputers, which of course are capable of being heterogeneous systems.
How does OpenACC impact HPC programmers?
As alluded to earlier, OpenACC allows HPC programmers to worry more about the problem they are trying to solve and less about the language and hardware they are using to solve it. By enabling parallelism via directives, OpenACC lowers the bar for entry into heterogeneous computing, since developers can begin porting their codes to heterogeneous systems while maintaining source code that still runs well on a homogeneous system.
What other compiler directive sets is Cray involved with?
Cray has been a member of the OpenMP ARB since 2006, if I recall correctly. OpenMP is the industry-standard mechanism for exploiting shared-memory parallelism on a node. With the release of OpenMP 4.0, support for accelerators was added. Cray was an active participant in the development of these additions; we provided the original proposal and co-chaired the subcommittee on the topic.
In the battle of compiler directives, how does the OpenACC API differ from the OpenMP API? How are they similar?
OpenACC and OpenMP are very similar because OpenACC was developed in part from the prototype work Cray used to help drive the OpenMP subcommittee work. Thus the memory and execution models are nearly identical. However, because OpenACC split off from OpenMP more than a year before OpenMP 4.0 was completed, there are some significant differences. The primary difference is that OpenACC intentionally limits what a programmer can do on a device to help ensure that codes are performance portable. OpenMP chose the route of giving the programmer the freedom to use all existing OpenMP directives on the accelerator. Thus it is possible to write OpenMP programs that run perfectly well on one device and cannot run on another device at all because of limitations on that device. An in-depth analysis of the differences between OpenACC and OpenMP will be given at a later date.
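As a rough illustration of how closely the two models line up for a simple case, the sketch below expresses the same loop once with an OpenACC directive and once with an OpenMP 4.0 `target` directive. The function names `scale_acc` and `scale_omp` are hypothetical, and this is only a surface-level correspondence under the assumption of a single flat loop; it does not capture the deeper differences discussed above.

```c
#include <assert.h>

/* OpenACC version: scale v in place on the accelerator. */
void scale_acc(int n, double a, double *v)
{
    assert(n >= 0);
    /* copy(): move v to the device before the loop and back afterwards. */
    #pragma acc parallel loop copy(v[0:n])
    for (int i = 0; i < n; ++i) {
        v[i] *= a;
    }
}

/* OpenMP 4.0 version of the same loop. */
void scale_omp(int n, double a, double *v)
{
    assert(n >= 0);
    /* map(tofrom:) plays the role of OpenACC's copy() clause here. */
    #pragma omp target teams distribute parallel for map(tofrom: v[0:n])
    for (int i = 0; i < n; ++i) {
        v[i] *= a;
    }
}
```

In both cases the loop body is untouched and the data movement is stated in a clause on the directive; a compiler that supports neither model ignores both pragmas and runs the loops serially.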
Will OpenACC remain relevant now that OpenMP 4.0 has been released?
OpenACC and OpenMP solve the same problem, but they solve it in very different ways. Because of this difference in philosophy, I believe OpenACC will continue to offer features that OpenMP may never offer. Also, because OpenMP has a larger user base and a larger existing directive set to deal with, the development and approval of new versions of the OpenMP specification take more time. Thus, OpenACC will be able to evolve more rapidly and provide a proving ground for features that OpenMP may want to adopt in the future.