Beyond spatial scalability limitations with a massively parallel method for linear oscillatory problems
International Journal of High Performance Computing Applications
This paper presents, discusses and analyses a massively parallel-in-time solver for linear oscillatory PDEs, which is a key numerical component for evolving weather, ocean, climate and seismic models. The time parallelization in this solver allows us to use significantly more computing resources than parallelization-in-space methods can exploit, and results in a correspondingly reduced wall-clock time. One of the major difficulties of achieving Exascale performance for weather prediction is that the strong scaling limit – the parallel performance for a fixed problem size with an increasing number of processors – saturates. A main avenue to circumvent this problem is to introduce new numerical techniques that take advantage of time parallelism. In this paper we use a time-parallel approximation that retains the frequency information of oscillatory problems. This approximation is based on (a) reformulating the original problem into a large set of independent terms and (b) solving each of these terms independently of the others, which can now be accomplished on a large number of HPC resources. Our experiments are conducted on up to 3586 cores for problem sizes at which the parallelization-in-space scalability is already limited on a single node. We gain significant reductions in the time-to-solution of 118.3× for spectral methods and 1503.0× for finite-difference methods with the parallelization-in-time approach. A developed and calibrated performance model gives the scalability limitations a priori for this new approach and allows us to extrapolate the performance of the method towards large-scale systems. This work has the potential to contribute as a basic building block of parallelization-in-time approaches, with possible major implications in applied areas modelling oscillatory-dominated problems.
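Steps (a) and (b) can be illustrated with a minimal sketch. It is not the approximation used in the paper: as a stand-in it assumes a Cauchy contour integral discretized with the trapezoidal rule on a circle, which also writes exp(A) u0 as a sum of terms that each need one independent linear solve. The function names, the circle radius, the number of terms and the harmonic-oscillator test problem are all hypothetical choices made for illustration.

```python
import numpy as np

def independent_terms(A, u0, n_terms=64, radius=4.0):
    """Return n_terms contributions whose sum approximates exp(A) @ u0.

    Assumption: every eigenvalue of A lies well inside the circle |z| = radius.
    Each contribution needs exactly one linear solve and depends on no other
    contribution, so in principle each could run on a different processor.
    """
    identity = np.eye(A.shape[0])
    theta = 2.0 * np.pi * np.arange(n_terms) / n_terms
    z = radius * np.exp(1j * theta)        # quadrature nodes on the circle
    weights = z * np.exp(z) / n_terms      # trapezoidal-rule weights
    return [w * np.linalg.solve(zk * identity - A, u0)
            for zk, w in zip(z, weights)]

def approx_exp_apply(A, u0, n_terms=64, radius=4.0):
    """Sum the independent contributions (the only global reduction step)."""
    return np.sum(independent_terms(A, u0, n_terms, radius), axis=0)

# Hypothetical test problem: harmonic oscillator u'' = -omega^2 u written as
# a first-order linear system du/dt = L u, advanced by one large step t.
omega, t = 2.0, 1.0
L = np.array([[0.0, 1.0],
              [-omega**2, 0.0]])
u0 = np.array([1.0, 0.0])

u = approx_exp_apply(t * L, u0)
exact = np.array([np.cos(omega * t), -omega * np.sin(omega * t)])
print("max error:", np.max(np.abs(u.real - exact)))
```

In this sketch the loop over the quadrature nodes is the part that would be distributed across processors, and the final sum is the only communication step. The accuracy depends on the assumed radius and number of terms relative to the spectrum of the operator.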
The authors gratefully acknowledge the Gauss Centre for Supercomputing e.V. (www.gauss-centre.eu) for funding this project by providing computing time on the GCS Supercomputer SuperMUC at Leibniz Supercomputing Centre (LRZ, www.lrz.de). We also acknowledge use of Hartree Centre resources in this work, on which the early evaluation of the parallelization concepts was done.
This is the author accepted manuscript. The final version is available from SAGE Publications via the DOI in this record.
First Published February 3, 2017