Accelerating high-order mesh generation with an architecture-independent programming model
Computer Physics Communications
© 2018 The Authors. Published by Elsevier B.V. Open Access funded by Engineering and Physical Sciences Research Council. Under a Creative Commons license: https://creativecommons.org/licenses/by/4.0/
Heterogeneous manycore performance-portable programming models and libraries, such as Kokkos, have been developed to facilitate portability and maintainability of high-performance computing codes and enhance their resilience to architectural changes. Here we investigate the suitability of the Kokkos programming model for optimising the performance of the high-order mesh generator NekMesh, which has been developed to efficiently generate meshes containing millions of elements for industrial problems involving complex geometries. We describe the variational approach for a posteriori high-order mesh optimisation employed within NekMesh and its parallel implementation. We discuss its implementation for modern manycore massively parallel shared-memory CPU and GPU platforms using Kokkos and demonstrate that we achieve increased performance on multicore CPUs and accelerators compared with a native Pthreads implementation. Further, we show that we achieve additional speedup and cost reduction by running on GPUs without any hardware-specific code optimisation.
JE gratefully acknowledges support from EPSRC and the President’s Scholarship of Imperial College London. MG acknowledges support from the PRISM project under EPSRC grant EP/L000407/1. MT acknowledges Airbus and EPSRC for funding under an industrial CASE studentship. DM acknowledges support from the EU Horizon 2020 project ExaFLOW (grant 671571). The Quadro P5000 GPU used for this research was kindly donated by the NVIDIA Corporation.
This is the author accepted manuscript. The final version is available from Elsevier via the DOI in this record.
Published online 5 April 2018