A few dozen Chapel enthusiasts recently devoted a weekend to CHIUW 2015 — the second annual Chapel Implementers and Users Workshop. CHIUW (pronounced “chew”) is a forum for developers and practitioners of the Chapel parallel programming language to meet and share results while strategizing about the language’s future. This year’s workshop was held in Portland, Oregon, as part of the 36th annual ACM SIGPLAN conference on Programming Language Design and Implementation (PLDI). Between events, attendees made good on the CHIUW name by taking advantage of Portland’s food scene at Frank’s Noodle House, the Red Star, Blossoming Lotus, Burnside Brewing Company, Salt and Straw, Voodoo Doughnuts (natch) and Tilt (get the pie shake!).
CHIUW 2015 was an invigorating event for the Chapel community. It underscored the breadth of interest in Chapel, from traditional HPC users to potential adopters in the mainstream and data analysis communities. While it is clear that Chapel performance needs to continue improving, particularly to attract today’s MPI users, other communities — for whom programmer time-to-solution outweighs raw performance — are ready to use Chapel now, and are also interested in contributing code and documentation back to the broader community.
Here’s a summary of the highlights from CHIUW 2015:
Day one followed a mini-conference format, featuring submitted technical talks selected by CHIUW’s program committee. It kicked off with a two-part introductory talk from Cray that provided background and context for attendees. Part one, “Chapel Boot Camp,” gave a brief overview of Chapel for those new to the language. Part two, a “State of the Project” overview, summarized recent progress and achievements. It noted a recent uptick in Chapel interest, some of which was generated by Jonathan Dursi’s recent “HPC is Dying, and MPI is Killing it” blog post, and wrapped up with references to exciting recent efforts by users unable to attend CHIUW, such as Primordial Machine Vision Systems’ release of “Chapel by Example,” a 154-page guidebook.
Michelle Strout (Colorado State) gave the first contributed talk, describing notable work on parameterized diamond tilings that her team published at ICS 2015. Their results show that Chapel performs competitively with C+OpenMP while providing significant software engineering benefits due to its native support for parallel iterators. Next, Sparsh Mittal (ORNL) described his published study comparing Chapel, Go, and D for successive over-relaxation (SOR) computations. His results demonstrate that Chapel provides a clean representation of the algorithm that outperforms his Go and D versions.
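For readers new to the language, the kind of abstraction being credited here can be sketched as a user-defined iterator that factors loop-scheduling logic (such as tiling) out of the computation it drives. The following is a hypothetical illustration written for this post, not the ICS 2015 code; the `tiles` iterator and its tile size are invented for the example:

```chapel
// A serial iterator that yields tile-sized subranges of a range.
iter tiles(r: range, tileSize: int) {
  for lo in r by tileSize do
    yield lo..min(lo + tileSize - 1, r.high);
}

// A standalone parallel overload of the same iterator, so the identical
// call site can also drive a forall loop. The tiles are generated in
// parallel; each tile is then processed serially by its task.
iter tiles(param tag: iterKind, r: range, tileSize: int)
    where tag == iterKind.standalone {
  forall lo in r by tileSize do
    yield lo..min(lo + tileSize - 1, r.high);
}

config const n = 1000;
var A: [1..n] real;

// The loop body stays focused on the computation; the tiling strategy
// lives entirely in the iterator and can be swapped out independently.
forall t in tiles(1..n, 64) do
  for i in t do
    A[i] = i * 2.0;
```

Because the schedule is encapsulated in the iterator rather than woven into the loop nest, alternative tilings can be substituted without touching the kernel — the software engineering benefit the talk highlighted.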
Brian Guarraci, a senior staff software engineer at Twitter, gave a talk about his spare-time experimentation with Chapel on ARM clusters. He’s been exploring what it would take to build service-oriented architectures (SOA) in Chapel — a usage model that is very different from the scientific computations for which Chapel was originally designed. Brian’s talk provided rich fodder for code camp discussion on day two.
The other three full-length talks were given by members of the Cray Chapel team. Greg Titus described Chapel’s Hierarchical Locale Models, designed to make the language future-proof against emerging compute node architectures. Ben Harshbarger summarized his recent locality optimizations, which simplify Chapel’s generated code, improving distributed memory performance. Elliot Ronaghan reported on his recent work supporting vectorization of data parallelism in Chapel.
Bill Carlson’s (IDA) inspiring keynote talk, entitled “Shared Memory HPC Programming: Past, Present, and Future”, started with a historical perspective, citing the value of the Cray® T3D™ system and UPC language as instances of making HPC programming more like traditional desktop programming. He went on to lament the small number of applications that run on large-scale systems today, citing the lack of productive parallel programming models as a limiting factor not just for developing new software, but also for experimenting with new algorithms. He then posited that, as large-scale systems become increasingly complex, interest in HPC may decline simply due to the small number of programmers who will be able to use HPC systems effectively. Bill’s keynote wrapped up with a call for increased collaboration within the parallel language community.
Hot Topics Talks
The first day’s presentations wrapped up with a lively “Hot Topics” session, composed of six rapid-fire 10-minute talks. In that spirit, here’s the quick rundown:
- Michael Ferguson (Cray) described our plans for formalizing Chapel’s memory consistency model. Takeaway: It’s similar to C++11, with enhancements for Chapel-specific features.
- Thom Popovici (CMU) described his efforts to write efficient, native FFT routines in Chapel. Takeaway: Chapel’s features suit FFTs well, but more performance tuning is required.
- Laura Brown (ERDC) described a Fortran+MPI vs. Chapel study that she undertook. Takeaway: Chapel’s reductions have a (known) serious scalability issue, but she sees benefit in using Chapel once performance becomes more competitive.
- Jens Breitbart (TUM) shared user experiences from dataflow programming in GASPI. Takeaway: Users are open to MPI alternatives if performance is good and the switching cost is not too high.
- Dylan Stark (Sandia) shared some experimental results illustrating the effect of runtime configuration options on performance. Takeaway: Runtime options can have a significant impact, but end users are often unaware of them.
- Mauricio Breternitz (AMD) gave a report on mapping Chapel to HSA via its eXtended Task Queueing Model (XTQ). Takeaway: XTQ supports Chapel’s communication interface well and results in speedup.
Chapel Code Camp
For the second day of CHIUW, we split into a half-dozen two- to four-person groups, each of which explored a coding challenge or design question posed by an attendee. Topics included: authoring active libraries in Chapel; expressing data processing workloads in Chapel; expressing native FFTs in Chapel; targeting HSA optimally via data transfer optimizations and code generation improvements; providing standard libraries for linear algebra, plotting and image processing; and supporting ZeroMQ within Chapel. Each of the groups made significant progress in their respective areas.