Scalable Hierarchical Algorithms for eXtreme Computing 2016 workshop

Participants at the SHAXC 2016 workshop enjoy discussion in small groups on computing issues critical to the world today.

From May 9 to 11, the University's Extreme Computing Research Center (ECRC) hosted the Scalable Hierarchical Algorithms for eXtreme Computing (SHAXC) 2016 international workshop on the KAUST campus. Participants from academia, the computer industry and industries that rely heavily on computation gathered to explore algorithmic road maps that promise to sustain the exponential increases in computational power that scientists and engineers have come to expect to improve the accuracy and fidelity of their simulations.

SHAXC was sponsored by the KAUST Office of Sponsored Research (OSR) and the KAUST Industry Collaboration Program and was organized by the ECRC. The ECRC’s mission is: (1) to perform basic research and develop algorithms and software that will enable today's applications to migrate to the exascale systems that are expected by early next decade; and (2) to enable the University's scientific and engineering simulation campaigns to exploit today's state-of-the-practice petascale systems, such as the KAUST supercomputer Shaheen XC40, which is currently ranked at #9 on the list of the world’s TOP500 supercomputers.

Computing power

Computational power delivered to real-world applications—such as reservoir dynamics, molecular dynamics and aerodynamics—has increased by a factor of 1,000 over the past two decades. This has improved the predictive power of simulations, reducing design times and failures and lowering the cost of experiments. In these and other domains, to out-compute is to out-compete. However, this coveted acceleration is beginning to slow. 

Many are convinced that the evolving computing hardware used throughout high-end simulations and data analytics requires a new software infrastructure. Moore’s law of semiconductor technology is now maintained in the face of physical constraints by expanding processor count rather than increasing processor speed. Memory capacity and memory bandwidth fail to keep pace, so little new performance is realized for practically important algorithms that are often already memory-bound. In addition, the energy costs per operation are decreasing more slowly, which extrapolates to computers that we can engineer and afford to acquire but not to power up. 

​Mathematics for a new software infrastructure

Seeds of a new scientific software infrastructure were planted decades ago, and much of the mathematics has been developed. How effectively, however, does the math map onto the increasingly hierarchical memory systems and dynamic runtime systems offered to today’s programmers? International experts joined in to discuss this topic at SHAXC.

"Scalable” refers to algorithms that can, for example, keep the time of a simulation fixed as the resolution of the simulation grows, by adding processors (“workers”) in proportion to the work. Not all algorithms are scalable because some tasks depend upon others to be completed first.

“Hierarchical” refers to algorithms that obtain scalability by considering a single problem at a hierarchy of representations of different sizes, attempting to accomplish most of the progress on the coarser, smaller and therefore cheaper representations while ultimately obtaining the accuracy of the finest representation.

Five of these algorithms have been popularized over the past 60 years (roughly one per decade), including the Fast Fourier Transform; the Multigrid method; the Fast Multipole method; Sparse Grids; and Hierarchically Low-rank Matrix methods (or “H-matrices”). KAUST organized SHAXC workshops in 2012 and 2014 that highlighted the mathematical interplay between these methods. These earlier workshops suggested investing KAUST resources in developing high performance implementations of two of the algorithms. 

A computing expert speaks to the audience at the University's SHAXC 2016 workshop. ​

Opportune workshop timing

The SHAXC’16 workshop took place at a time when there are significant starts around the world on the software implementations of Fast Multipole and H-matrices that will be critical to the performance of many scientific codes as kernels in simulations based on formulations of partial differential equations, integral equations, interacting particles, covariance matrices in statistics and beyond on future computer systems.

Successfully migrating these kernels to the computational environment of the exascale—quadrillions of bytes of memory and quadrillions of operations per second—will create paths that many full applications can follow, just as, a generation earlier, standardized dense and sparse linear algebra libraries led full scientific applications into efficient use of today’s distributed memory computers.

Exascale challenges

The challenges of the exascale include reducing communication; reducing synchronization; increasing concurrency to exploit efficient manycore and GPU processing elements; and incorporating resilience into the algorithms themselves, rather than leaving the entire burden of resilience to the hardware.

Advances in these areas simultaneously reduce time to solution and rate of power consumption. There is evidence that fast multipole methods and their H-matrix cousins fare well on emerging architectures. Even apparently highly synchronous methods like multigrid have defied pessimistic prognoses and scaled to the edge of today’s hardware by continued algorithmic adaptations.

It was therefore opportune to bring together leading developers and practitioners of hierarchical algorithms of diverse stripes in one workshop aimed at sharing knowledge, practice, projections about architecture targets and code—possibly inspiring one of today’s young scholars to discover the next scalable hierarchical algorithm!

- By David Keyes